Preface


One of the outstanding pioneers in the development of quantum theory 
and particularly its applications to atoms, molecules, and the solid state is 
Professor John Clarke Slater. He has published more than one hundred 
original papers in the field, and his achievements are so fundamental and 
numerous that there is hardly any area of the quantum theory of matter which 
is not basically influenced by his work. In addition, he has been an excellent 
and stimulating teacher, and his series of textbooks has proven to be of 
essential value in universities all over the world. 

His many students, colleagues, and friends felt the need to honor him at 
this stage of his scientific career with a special birthday volume dedicated 
to the field of atomic, molecular, and solid-state theory in recognition of his 
fundamental discoveries and achievements. Since, for practical reasons, the 
book has to be limited in size, only a selected number of papers could be 
included, but the volume is still a tribute from all of us who have benefited 
from his work and friendship. The volume is striking evidence of the
stimulating influence of Professor Slater’s work on current research in the
field, and also of the impact it will have in the future.

Even though more than forty years have now passed since Professor 
Slater published his first scientific paper, there are no signs that he is
approaching an age at which many would prefer to retire, and he is as active
in teaching and research as ever before. With this birthday volume, his many 
students, friends, and colleagues enclose warmest wishes for many happy 
years to come. 

Per-Olov Löwdin


September, 1966 
John Clarke Slater, a Biographical Note
of Appreciation 


PHILIP M. MORSE 

MASSACHUSETTS INSTITUTE OF TECHNOLOGY, CAMBRIDGE, MASSACHUSETTS 


I first met John Slater on the morning of the day of Karl Compton’s
inauguration as President of M.I.T. in June of 1930. He looked more like a
young graduate student than the newly appointed head of the Institute’s
Department of Physics. As he drove us, in an elderly, open car, around 
Cambridge, he talked enthusiastically about Compton’s plans to emphasize 
basic science at the Institute and his own plans to reorganize the department. 
By the time we returned from Europe, a year later, many of these plans were 
being implemented and the Department was in the process of becoming one 
of the outstanding departments of physics in the country. 

Slater was fitted, in many ways, to lead this rapid progress. He grew up in 
an academic family, his father being head of the English Department at the 
University of Rochester. After receiving his bachelor’s degree from Rochester 
in 1920 he came to Harvard as a graduate student, in time to receive solid 
training in classical physics under Bridgman, and later, to be on hand to
help lead the explosive development of the new physics, following the
breakthroughs of Schrödinger, Heisenberg, and others. In 1922 he went to England
and to Copenhagen, where he worked with Bohr, returning in 1924 to Harvard 
as instructor and assistant professor. Already his interest had turned to 
problems of atomic and molecular structure; his papers on the ground state 
of helium, on screening constants for light atoms, and on the calculation of 
equations of state appeared between 1926 and 1930. By then he had begun 
the task of computing the physical and physico-chemical properties of atoms, 
molecules, and solids from the basic equations of quantum mechanics, which 
has held much of his interest since. 

After another year at Zurich and Leipzig, in 1929-1930, he returned to 
become head of the department at M.I.T. By the time I returned, in 1931, he 
had reorganized the undergraduate and graduate courses in physics, which 
were to educate a growing number of future leaders in the field. Although, 
even then, the department was among the larger ones in the country in 
regard to size and teaching load, it was still small enough to maintain close 
personal contacts. John knew, in detail, what all of us were doing and gave 
encouragement when needed. Daily teas in the Moore Room served to bring 
the graduate students and faculty together, and the Slaters’ small house, 
overlooking the Charles River beyond Lars Anderson bridge, served as a 
social center for department members. To us newcomers, John and his 
gracious wife, with their many contacts at Harvard, eased our introduction 
to the peculiar attractions of Cambridge life. 

Although he spent much time planning and teaching undergraduate 
(including freshman) courses, Slater found time to continue his own research 
and to supervise the research of students. He supervised the bachelor’s thesis 
of Richard Feynman and the doctoral thesis of William Shockley, among 
others. A visit to Slater’s office would, often as not, find him typing out his 
next paper or integrating out a Hartree equation—by hand, of course, since 
electronic computers were only a gleam in the eye of Vannevar Bush in those 
days. Mention of a few titles of his papers in the Physical Review,
“Molecular Energy Levels and Valence Bonds” (1931), “Electronic Energy Levels
in Metals” (1934), “The Ferromagnetism of Nickel” (1936), “The Nature of
the Superconducting State” (1937), indicates the nature of his ground-breaking
work during the thirties. As was aptly said by one who should know, “the invention
of the transistor was the outgrowth of the pioneering theoretical work of
John C. Slater and a few other academicians in solid state physics.” And,
between papers, he and Frank wrote, and tested out in class, their perennially 
useful texts on theoretical physics. This was one of the first undergraduate 
courses in theoretical physics specifically preparatory to quantum mechanics. 
The fact that, even now, undergraduate physics at the Institute is taught by 
professors, not teaching assistants, is due to Slater’s insistence that teaching 
improves research and vice versa. 

In the years 1939 and 1940, when we turned to defense research, Slater 
was active in organizing the Radiation Laboratory, which carried on the 
development of radar, initiated in Britain. Slater took the theory of the 
magnetron as his assignment, thus applying his experience in the theory of 
electron waves in metal lattices to the theory of plasma oscillations in periodic 
structures. When Bell Telephone Laboratories organized to design and
produce magnetrons, Slater transferred his activities to New York, where his
principles of circuit design enabled magnetron power to be increased nearly 
a hundredfold, earning him a Presidential Certificate of Merit at the end of 
the war. 

In 1946 he returned to M.I.T. eager to organize the department’s extension 
in the rapidly growing field of nuclear physics, persuading Weisskopf and 
Zacharias and many others to move from Los Alamos to Cambridge. But 
his own research returned to solid-state physics, as evidenced by the series of 
papers and books, familiar to us all, which appeared during the fifties. He 
organized and directed the research program of the Solid-State and Molecular
Theory Group, which has attracted many graduate students as well as
postdoctoral fellows and visiting professors.

In 1951 he turned over his administrative responsibilities as department 
head to N. H. Frank, was appointed H. B. Higgins Institute Professor of the 
Solid State, and concentrated on his research and on fostering the research 
of others. By the years 1960 and 1961, this work culminated in the
organization of the interdepartmental Center for Materials Science and Engineering,
now occupying a new building, completed in 1965. Though active in its
conception and materialization, Slater preferred to relinquish its direction to
others. By the time of its completion he had entered into an arrangement 
whereby he spends winters at the University of Florida and summers at the 
Institute. In both places his abiding interest in solid-state research continues
to inspire students and colleagues.



John Clarke Slater 
His Work and a Bibliography 


ROBERT S. MULLIKEN* 


Slater began his scientific career [see bibliography (2,12)] with his Ph.D. 
research with Bridgman, working on the experimental investigation of the 
compressibilities (using single crystals) “of twice as many of the alkali halides
as had been previously measured and (seeing) how much information these 
data are capable of yielding about the forces in the crystal.” Besides getting 
better data than before, he was able to determine the crystal potential energy 
as a function of volume at 0°K and to break this down sharply into a sum of 
Madelung interionic cohesive energy plus energies of repulsive forces for 
which he obtained series expansions. This early work keynotes Slater’s lifelong
interest in the empirical structure of matter and its theoretical explanation. 
He finished his Ph.D. work just two or three years before quantum mechanics 
burst over the horizon, and for the next few years he participated first in the 
efforts [see many of Papers 1-16] which led up to quantum mechanics (efforts 
to understand dispersion theory, spectroscopic transition probabilities,
interpretation of spectra, radiation theory, etc.), and thereafter in the application
of the then new quantum mechanics to the chemical, spectroscopic,
magnetic, and other properties of atoms, molecules, and solids. Slater constantly
emphasizes and analyzes the interrelations between experiment and theory, 
and between physical and chemical ideas and mathematical formulations. 

Slater’s papers in the period from 1928 to 1933 penetrated, in many
directions, into the major problems of atomic, molecular, and metallic structures,
and of interatomic and intermolecular interactions. Thus, his paper “On
the Normal State of He” (20) includes a theoretical derivation of an
approximate formula for the interaction energy of two He atoms (van der Waals
attraction and closed-shell repulsions). Later, with Kirkwood (27) (who came
to work with Slater as a postdoctoral fellow), the work was extended to
calculations of the van der Waals cohesive forces for a number of gases. A further
development appeared in “The Quantum Theory of the Equation of State”
(29) for imperfect gases.

* With the co-operation of Per-Olov Löwdin and James Conklin.

A “Note on Hartree’s Method” (24), following an analysis (19) of Hartree’s
self-consistent-field method, gives briefly a justification of the method in
terms of a variational approach to the solution of Schrödinger’s equation, and
also states briefly the idea of what is now called the Hartree-Fock method 
(Fock’s paper appeared almost simultaneously). Few ideas have turned out 
to be more fruitful for atomic theory. In his paper “Theory of Complex 
Spectra” (22) Slater introduced an extremely practical approach, in a critique of
Hund’s rule, to the calculation of the energy separations among the different
L and S states of an atomic electron configuration; he wrote antisymmetrized
functions using the now well-known “Slater determinant” forms, and broke
up the Coulomb and exchange integrals into linear combinations of his now
well-known F and G integrals. The “Slater determinants” are now one of the
most useful tools in the quantum theory of matter. In his paper on “Atomic
Shielding Constants” (25) he introduced the specifications for the extremely
useful approximations to atomic orbitals which are now known as “Slater
orbitals,” and in his paper “Analytic Atomic Wave Functions” (32) he
discussed clearly how SCF (self-consistent-field) orbitals differ from H atom 
orbitals, and showed how Hartree’s numerically tabulated SCF orbitals can 
be fitted by analytical expressions which are linear combinations of what are 
called “Slater-type orbitals” (STO’s), namely, Slater orbitals scaled up or 
down in size. This idea has been used later by Roothaan in his LCSTO 
molecular-orbital SCF wave functions for molecules, in the Hartree-Fock- 
Roothaan method. Slater’s paper (23) “Cohesion in Monovalent Metals” 
begins by comparing the Heitler-London and LCAO molecular-orbital 
methods for H₂, discussing the need for adding ionic terms to the Heitler-London
atomic functions. A more detailed discussion of this important paper is given 
elsewhere in this volume. 
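
For readers unfamiliar with the Slater-type orbitals mentioned in the paragraph above, the form in which they are conventionally written (in atomic units) is sketched here. The symbols n*, ζ, s, and N follow common textbook notation and are assumptions of this sketch; they are not quoted from papers (25) or (32).

```latex
\chi_{n l m}(r,\theta,\varphi)
  = N \, r^{\,n^{*}-1} \, e^{-\zeta r} \, Y_{l m}(\theta,\varphi),
\qquad
\zeta = \frac{Z - s}{n^{*}} .
% Z   : nuclear charge
% s   : shielding (screening) constant for the shell in question
% n^* : effective principal quantum number
% N   : normalization constant
% Y_lm: spherical harmonic
```

In a Slater orbital the exponent ζ is fixed by the shielding constants; in a Slater-type orbital it is treated as an adjustable scale parameter, which is what allows a linear combination of several such functions to fit a numerically tabulated SCF orbital.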

In a paper (26) on “Directed Valence in Polyatomic Molecules” Slater, 
independently of Pauling [J. Am. Chem. Soc. 53, 1367 (1931)] discussed 
directional effects of Heitler-London bonding in atoms of the types F, O, 
N, and C, and pointed out that where the valences come from p electrons, they 
should tend to be directed at right angles, while the four valences of C have 
tetrahedral symmetry through formation of tetrahedral sp³ hybrid valence
electron orbitals. Slater in this paper introduced and used the criterion of
maximum overlapping. He also discussed polar character in molecules like
HCl, and introduced the concept of the stabilization of benzene by the mixing
of the two Kekulé wave function structures (later extensively discussed by
Pauling as quantum-mechanical “resonance”). This paper, which contains
many photographs of molecular models, makes use of methods previously
employed by Slater in the paper on “Complex Spectra” (22) and also applied
in the paper on “Cohesion in Monovalent Metals” (23) which initiated his
work in valence theory. In a paper (28) on the structure of the groups XO₃,
he shows how the pyramidal structure of such ions as ClO₃⁻ on the one hand
and the planar structure of those like NO₃⁻ can be understood in terms of the
directional properties of valence. The paper on “Molecular Energy Levels and
Valence Bonds” (30) develops more fully and extends the ideas presented
in paper (26). Slater’s paper “The Virial and Molecular Structure” (33) 
called attention for the first time to the importance of applications of the 
quantum-mechanical virial theorem for the understanding of molecules. In 
some of his papers, Slater has discussed the sizes of atoms, including tables of 
atomic radii, simple formulas for estimating the size of atomic radii from 
Slater orbitals, and instructive comparisons between sums of these atomic 
radii and observed interatomic distances in molecules and crystals. 

In his later work, Slater has continued to explore in many directions methods
for the theoretical understanding of the properties of atoms, molecules, and
solids, the emphasis in his later work as in his earliest work being on
crystalline solids, and, in this area, his contributions are as numerous and important
as those which have already been mentioned. 

With his remarkable ability to correlate physical truth with the models 
used to understand it, he was among the first to clearly demonstrate the 
failures of the free-electron theory of metals in explaining their electronic 
properties, at the same time pointing out the areas in which the theory did 
correctly depict the properties of the conduction electrons. In Paper 36 he 
showed that the electrons in a metal have atomic, rather than free-electron, 
properties near the nuclei, but that, away from the nuclei, their wave functions 
might have a strong resemblance to free-electron functions. He further pointed 
out that, for sodium, the energy as a function of k is very similar to a perturbed 
free-electron energy curve, thus explaining the success of the free-electron 
model for that metal. More recently he has carried these ideas even further, 
showing specific conditions which must be fulfilled by the electronic wave 
functions in order for the energy bands to be free-electronlike. He has 
illustrated [see, for example, bibliography (775)], the manner in which these 
conditions are, in fact, fulfilled for the alkali metals—notably free-electronlike 
—and how they are not fulfilled in some of the metals for which the electrons 
are not describable at all by the free-electron model. 

With Slater’s understanding of the inaccuracies of the free-electron model 
of metals, it is not surprising that he has had an active part in the development 
of better methods for the calculation of electronic properties of solids. His 
publications include calculations by many methods, including the Thomas- 
Fermi statistical model and the Wigner-Seitz method, but it was Slater
himself who proposed the method which is today one of the most widely used
approaches to the calculation of electron energy bands in solids: the augmented
plane wave (APW) method. Originally proposed in 1937 (44), and described
in more detail and from other points of view in later publications, the method 
is based on Slater’s realization that the wave function of an electron in a solid 
is very atomlike near the nuclei, but more like the plane waves of free electrons
away from the nuclei. Therefore, why not use an expansion which also had 
this characteristic, to obtain rapid convergence? Thus, in the APW method, 
the wave function is constructed from functions which are plane waves in the 
regions outside of imaginary spheres surrounding the nuclei, and atomiclike 
solutions of a spherically symmetric potential within these spheres. This 
approach eliminates the slow convergence of pure plane-wave expansions and 
the many-center integrals which cause so much difficulty with LCAO
approaches. It is still a difficult problem for hand calculation, but quite feasible
on modern high-speed digital computers, and it is currently being used in 
theoretical solid-state groups in many countries of the world as a technique 
for the advancement of our knowledge and understanding of the properties 
of solids. 
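
The construction described above can be sketched in formulas. The notation below (sphere radius R, radial solution u_l, coefficients A_lm, reciprocal-lattice vectors G) is the usual modern one and is assumed here for illustration, with a single sphere centered at the origin; it is not taken from paper (44).

```latex
\phi_{\mathbf{k}}(\mathbf{r}) =
\begin{cases}
e^{i\mathbf{k}\cdot\mathbf{r}}, & r > R \quad \text{(between the spheres)} \\[4pt]
\displaystyle\sum_{l,m} A_{lm}(\mathbf{k})\, u_l(r;E)\, Y_{lm}(\hat{\mathbf{r}}),
  & r \le R \quad \text{(inside the sphere)}
\end{cases}
\\[6pt]
% Continuity at r = R fixes the coefficients, using the expansion of a
% plane wave in spherical Bessel functions j_l:
A_{lm}(\mathbf{k}) = \frac{4\pi\, i^{\,l}\, j_l(kR)\, Y_{lm}^{*}(\hat{\mathbf{k}})}{u_l(R;E)}
\\[6pt]
% The crystal wave function is then expanded in these functions:
\psi_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}} c_{\mathbf{G}}\, \phi_{\mathbf{k}+\mathbf{G}}(\mathbf{r})
```

Here u_l(r;E) is the solution of the radial Schrödinger equation in the spherically symmetric potential at energy E. Because each basis function is already atomlike inside the sphere, comparatively few terms in the sum over G are needed.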

The band theory of solids is, of course, interesting only insofar as it is 
useful in explaining properties which can be measured experimentally, and 
Slater has taken a leading role in the efforts to understand many of these 
effects. In addition to developing new ideas, he has used the older more 
approximate theories to teach all they are able about the physics he seeks to 
understand, often finding bits of truth upon which can be built more accurate 
models. In this way he used the older theories of conduction to pave the reader’s 
way to an understanding of the more difficult but far better quantum-mechanical
description of conduction in his paper (37). Slater has not hesitated to
extend or discard approaches which he felt were not truly descriptive of the 
important physical aspects of the problems he sought to understand, as is well 
illustrated in his series of papers on ferromagnetism in metals. As he felt 
that the Heisenberg model was not suitable for describing ferromagnetism 
in metals, he showed that the very limited band theory of ferromagnetism 
which Bloch had developed could be extended to account for the important 
experimentally observed magnetic phenomena and still be consistent with the 
other ideas of electronic properties of solids which had been so well explained 
by the band model. His treatment of spin waves in magnetic materials also 
led him to an idea concerning possible semilocalized states which would 
explain the huge diamagnetic effects encountered in superconducting solids, 
and so his name is also counted among those whose work had a part in the 
understanding of superconductivity (43). 

In other areas of solid-state physics he anticipated concepts which were not 
to become prominent or familiar to specialists in the field until several 
decades later. For example, in paper (45), we find a discussion of “Damped
Electron Waves in Crystals.” By introducing an empirical damping constant 
in the form of a pure imaginary term added to the real crystal potential, he 
was able to describe the inelastic damping of electrons observed in electron- 
diffraction experiments. The complex or “optical” model for a potential 
has, of course, been a popular one in both optics and nuclear physics. It has
been only very recently that solid-state theorists have looked seriously into
the problem of defining electron states in disordered crystals and liquid
metals. They have quite independently arrived at the description of the
damping of Bloch states due to the effects of disorder in terms of a complex or
“optical” model for the crystal potential. They have thus rediscovered in a 
fashion a concept initiated under somewhat different circumstances by 
John C. Slater back in 1937. 
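
Why an imaginary term in the potential produces damping can be seen in the simplest case, sketched below. The constant Γ > 0, the sign convention, and the assumption that the imaginary part is uniform in space are simplifications chosen for illustration and are not taken from paper (45).

```latex
% Add a constant imaginary term to a Hamiltonian H_0 with real eigenvalues E_n:
H = H_0 - i\,\Gamma , \qquad H_0\,\psi_n = E_n\,\psi_n
% Each stationary state then evolves as
\psi_n(\mathbf{r},t) = \psi_n(\mathbf{r})\, e^{-i (E_n - i\Gamma)\, t/\hbar}
% so that the probability density decays exponentially in time:
|\psi_n(\mathbf{r},t)|^{2} = |\psi_n(\mathbf{r})|^{2}\, e^{-2\Gamma t/\hbar}
```

The loss of probability from the coherent wave stands in for the electrons removed from it by inelastic processes.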

Besides being the author of numerous research papers, Slater is the author 
of a number of valuable books both on general physics and on chemical 
physics (33, 52, 56, 62, 63, 70, 77, 102, 109, 111, 113). The most recent series of
volumes [(109), two volumes, and (111), two or more volumes] are intended by
Slater to summarize and digest the content of his research papers and other 
work on the structure of matter from the point of view of a chemical physicist. 

There were and are many other areas of interest in which the name of 
Slater is synonymous with progress in physical understanding. Not only this, 
but also there are now many of Slater’s students and an increasing number of 
his students’ students in whom there has been imbued some of the same 
spirit of seeking the physics behind the model, and seeking a model that 
describes the physics. This may well prove to be as great a contribution to the 
physical sciences as his own personal findings. One begins to appreciate the 
real magnitude of his contribution to physics and chemistry when one
discovers how many of those doing creative work in these fields today must count
some form of association with John Clarke Slater as having had a significant 
effect on their own creativity. 
Comments on Professor J. C. Slater’s Paper
“Cohesion in Monovalent Metals” 


PER-OLOV LÖWDIN


Among the scientists contributing to the Slater volume, it was felt that, in 
order to give a more complete picture of Professor Slater and his scientific 
personality, it would be desirable and appropriate to enclose a reproduction 
of one of his own classical papers. There is an early paper written in 1930 
about “Cohesion in Monovalent Metals,” a subject which has played a 
fundamental role in the development of molecular and solid-state theory. 
This paper has turned out to be a gold mine of fruitful ideas which, many 
decades later, is not yet completely exhausted. It is reproduced here as a 
typical representation of Professor Slater’s writing. 

The paper deals with the situation of the electrons in an alkali metal. In 
order to give a simple model of the metal, Professor Slater first discusses the 
hydrogen molecule in great detail. He studies, for the first time, the
interrelation between the valence bond method and the molecular-orbital method,
and he shows that the former, including ionized states, gives the same result 
as the latter including superposition of configurations. His discussion of the 
behavior of the energy curves for separated atoms is basic for later studies of 
the correlation problem. He uses the results obtained to investigate the
electronic structure of a metal and its cohesive, electric, and magnetic properties.
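
The equivalence mentioned above for the hydrogen molecule can be checked in a few lines. In this sketch a and b denote the atomic orbitals on the two nuclei, normalization factors are omitted, and only the spatial part of the singlet functions is written; the notation is assumed for illustration and is not that of the reprinted paper.

```latex
% Valence-bond functions: covalent (Heitler-London) and ionic
\psi_{\mathrm{HL}}  = a(1)\,b(2) + b(1)\,a(2), \qquad
\psi_{\mathrm{ion}} = a(1)\,a(2) + b(1)\,b(2)
% Molecular orbitals
\sigma_g = a + b, \qquad \sigma_u = a - b
% The two closed-shell molecular-orbital configurations expand as
\sigma_g(1)\,\sigma_g(2) = \psi_{\mathrm{ion}} + \psi_{\mathrm{HL}}, \qquad
\sigma_u(1)\,\sigma_u(2) = \psi_{\mathrm{ion}} - \psi_{\mathrm{HL}}
```

Since each pair of functions is a linear combination of the other pair, both span the same two-dimensional space, and a variational calculation within it gives the same energy whichever pair is taken as the starting point.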

The paper contains a discussion of the spin-degeneracy problem which is 
a landmark in the development of this field. After a discussion of “spin 
waves,” Professor Slater discusses such spin arrangements with “different spins
on different sublattices” which have proved to be fundamental for the 
alternant molecular-orbital method and for the modern treatment of the 
correlation problems using “different orbitals for different spins.”

In discussing the atomic model of a solid, he emphasizes the importance 
of the overlap integrals between atomic orbitals associated with neighboring 
centers and the resulting “nonorthogonality catastrophe.” He also indicates a
solution of this problem through the equivalence between the valence bond 
method and the molecular-orbital method, which later was realized in the 
construction of the Wannier functions. 
The focus of the paper is on the physical problem and the associated model
but, even if the formal mathematics and the number of equations are kept
strictly to a minimum, the mathematical structure underlying the general 
discussion is strict and clear and brought to life through the simplified models 
and examples used. 

The paper is a beautiful and typical example of Professor Slater’s clear and 
pedagogical style in treating even some of the most difficult phenomena in the 
quantum theory of matter. The importance of its influence on the development 
of the theory through more than three decades can hardly be overestimated. 


COHESION IN MONOVALENT METALS 


By J. C. Slater 

Jefferson Physical Laboratory, Harvard University
(Received January 27, 1930) 

Abstract 

The theory of metallic structure, of Sommerfeld, Heisenberg, and Bloch, is 
carried far enough to explain cohesive forces, and calculations are made for atoms 
with one valence electron, particularly metallic sodium. The numerical results, 
though rough, are in qualitative agreement with experiment. It is found that the 
forces in general are of the same nature as those met in ordinary homopolar binding, 
discussed by Heitler and London; except that the purely electrostatic force from
penetration of one atom by another is relatively more important, the valence effect 
from the exchange of electrons relatively less important, than in diatomic molecules. 

As a preliminary to the calculation, the relations of the methods of Heisenberg 
and of Bloch are discussed, and it is shown that they are essentially equivalent in their 
results when properly handled. Remarks are made both about conductivity and
ferromagnetism. In connection with conduction, it is shown that a definite meaning can
be given to free electrons, that they are necessary to conduction, and that a method 
can be set up for computing their number, which is rather small compared with 
the number of atoms. Ferromagnetism is discussed in connection with a recent 
paper of Bloch. It is shown that a metal like an alkali cannot be ferromagnetic, for 
atoms at such a distance that the interatomic forces keep the metal in equilibrium, 
are too close to be magnetic. For ferromagnetism, rather, it seems necessary to 
have one group of electrons responsible for cohesion, and another group, of smaller 
orbit and therefore relatively farther apart, producing the magnetism; a situation 
actually found only in the iron group and the similar groups. 

1. Introduction

A CRYSTAL of a metal is an enormous molecule, with electronic energy
levels depending on the positions of all the nuclei, just as the electronic
energy of a diatomic molecule depends on the internuclear distance. In this 
paper, in which we are interested in cohesive forces, we must find this energy 
of the lowest state in terms of the size of the crystal. We limit ourselves to 
geometrically similar arrangements of the nuclei, with changing scale. From 
the minimum of the curve, we find the heat of dissociation, grating space, 
and compressibility of the metal. But also we can investigate the wave
function of this lowest state, and obtain information about the electric and
magnetic properties of the metal. In this way we are naturally led to a discussion
of the calculations of Heisenberg¹ and of Bloch on these subjects; in order to
be sure that we really understand the arrangement of energy levels, we discuss
the relationships of their methods, and arrive at a consistent picture
combining them.

¹ W. Heisenberg, Zeits. f. Physik 49, 619 (1928); F. Bloch, ibid. 52, 555 (1929).
As for the results, one naturally asks first, what are the forces holding a
metal together? Are they ordinary attractions on account of penetration of 
atoms, or valence forces, or electrostatic forces of ionic attraction, or van der 
Waals forces, or some special sort not found in other cases? This question
cannot be answered categorically; no doubt all the forces are simultaneously 
present, and the problem is to find the relative magnitudes. The tentative 
result at which we arrive is that the simple penetration of one atom by 
another is the most important part of the effect. But valence effects are also 
present, although weakened by having the valences shared by many
neighbors, and are responsible for a considerable fraction of the attraction.
Although these actual magnitudes may not be verified by more accurate 
calculation, still we have discussed the problem in enough detail so that the 
general relations can be understood in any case. 

The other question one will ask is, what is the situation of the electrons 
in the metal? Can one give a meaning to the question, how many free
electrons are there? The answer, from whichever side we look at the question,
seems to be the same. Most of the valence electrons are at any time attached 
to their atoms. These electrons cannot take part in conduction; they could 
do it only by having a whole file of such electrons simultaneously jump to 
the next atom in line, a most unlikely occurrence. But a few electrons at 
any time—calculation suggests a few percent—will be detached from their 
atoms, leaving an equal number of positive ions behind them; and they are 
what, by all rights, one should call free electrons. These electrons, and the 
positive ions left behind, can take part in conduction. First, the free
electrons can move easily from one atom to the next. Second, a bound or
associated electron on one of the atoms next to a positive ion can jump to that ion,
leaving its own atom ionized. We are thus led precisely to the dual theory of
conduction, by free and by associated electrons, which Professor Hall²
has suggested and elaborated. When we look at the metal by the method of 
Heisenberg, these results become clear. In that method, a wave function 
consists of the assignment of electrons to atoms. We find that we must go 
beyond Heisenberg, in assigning sometimes two electrons to one atom, 
sometimes none, instead of always one; for we need such states to solve the 
problem of the stationary states of the metal. That is, we introduce free 
electrons. And when we consider transitions from one state to another, it 
is easy to see that these transitions can result in conduction only when such 
free electrons are present. On Bloch’s scheme, where we describe directly 
the velocity, rather than the position, of the electrons, it is less easy to see 
the relation; but here too one can show that, if there are no free electrons, the 
velocities of all electrons must compensate, so that there is no net current. 
Since this paper is not primarily about conduction, we do not go into these 
points with any detail. 

² E. H. Hall, Proc. Nat. Acad., 1920-1921.

The only metals specifically treated are those with one valence electron
per atom, and that in an s state; that is, the alkalies. And it is assumed that
they can be replaced by single valence electrons moving in non-coulomb
fields. This can be easily justified. It is to be noted that the other metals
are more complicated, not merely by having more electrons, but by having 
them in p or d orbits, thus introducing new degeneracies. The actual cal¬ 
culations of cohesion have been carried through for sodium, with satisfactory 
results. They are only done roughly, however; the primary purpose of this 
paper is to make clear the general relations, rather than to attempt accurate 
calculations. The work is being carried further by Dr. Bartlett, and I wish 
to thank him for help on some of the calculations used in this paper. The 
work described here has been done while the writer was on leave, working in 
Leipzig. He wishes to thank Professor Heisenberg for his courtesy in extend¬ 
ing the privileges of his laboratory, and for a number of illuminating con¬ 
versations on the subject of the paper; and also to thank Harvard University 
for granting leave, and the Guggenheim Foundation for the assistance of a 
fellowship. 

2. Comparison of Heisenberg’s and Bloch’s Methods 

The problem of a metal must be attacked by perturbation theory, and 
the unperturbed functions which we use can be set up in two quite different 
ways, one used by Heisenberg, the other by Bloch, either giving us a finite 
set of unperturbed functions. We regard the perturbation problem in the 
following way: we seek those linear combinations of these functions which, 
in the sense of the variation method, form the best approximations to solutions
of Schrödinger’s equation. This problem is solved by computing the
matrix of the energy operator with respect to these functions, and solving 
the equations 

$$\sum_k \left[ H(i/k) - \delta(i/k)\, W \right] S(k) = 0$$

for the coefficients S(k) to be used in making the linear combinations,
and the energy values W of the resulting terms. (The term δ(i/k) must be
given a slightly different form if the unperturbed functions are not
orthogonal.) This differs from the more conventional method: there one starts with
an infinite, complete set of unperturbed functions, instead of our finite set, 
but solves only as a power series in the non-diagonal terms of the energy 
matrix, breaking off after the second power in all ordinary applications. 
It resembles more closely the quite different method ordinarily used with 
degenerate systems, where one takes only very few unperturbed states, but 
correctly solves the problem of combining them. For a nearly degenerate 
problem like the present one, with a great many states near together, the 
conventional method of developing in series will not work well, for the series 
do not converge well, and we are forced to use something like the present 
method. The justification comes simply from the assumption that the lowest 
states can be well approximated by such a linear combination of Heisenberg's 
or Bloch’s functions (which correspond to having the atoms in their normal 
states). Surely this is not exact; for better results we should have to consider 
also the excited states of the atoms. But also certainly it is a fair approximation
for the lowest states of the metal.
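
The secular equations above can be sketched numerically. The following is a minimal illustration, assuming a real symmetric energy matrix `H` and a positive-definite overlap matrix `D` that plays the role of δ(i/k) when the unperturbed functions are not orthogonal. The function name `solve_secular` and the 2×2 numbers are hypothetical, chosen only for illustration and not taken from the paper.

```python
import numpy as np

def solve_secular(H, D):
    """Solve sum_k [H(i/k) - D(i/k) W] S(k) = 0 for a finite basis.

    H: symmetric energy matrix. D: symmetric positive-definite overlap
    matrix (the identity when the basis is orthonormal).
    Returns the energies W in ascending order and the coefficient columns S.
    """
    # The Cholesky factor D = L L^T reduces the problem to an ordinary
    # symmetric eigenproblem for L^{-1} H L^{-T}.
    L = np.linalg.cholesky(D)
    L_inv = np.linalg.inv(L)
    W, Y = np.linalg.eigh(L_inv @ H @ L_inv.T)
    S = L_inv.T @ Y  # back-transform to the original (non-orthogonal) basis
    return W, S

# Hypothetical two-function example (illustrative numbers only).
H = np.array([[-1.0, -0.6],
              [-0.6, -1.0]])
D = np.array([[1.0, 0.5],
              [0.5, 1.0]])
W, S = solve_secular(H, D)
print(W)
```

With only two functions the result can be checked by hand: the symmetric combination has energy (H11 + H12)/(1 + D12) and the antisymmetric one (H11 − H12)/(1 − D12).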

Heisenberg’s functions, amplified in a simple way, form good approximations
when the crystal is extended, for they are derived from the separated
atoms. Bloch’s functions on the other hand come by analogy with the free 
electron theory of Sommerfeld, and are good approximations when the 
crystal is compressed. The actual solutions of the perturbation problem are of 
course linear combinations of either Heisenberg’s or Bloch’s functions, not 
individual ones, and one gets the same final result whichever set one starts 
with (for the two sets of functions can be written as linear combinations of 
each other). But the fact that in the limiting cases the functions of one of the 
two sets become rather good approximations can be used, along with interpolation,
to derive the general nature of the real stationary states. This
comparison is made in the present section, and is illustrated by the interesting
case of H₂, where the calculations can be made exactly. At the outset, we
must recognize two facts: first, that we must amplify Heisenberg’s method 
by including polar states, to make it general enough to agree with Bloch’s 
and to permit conductivity; second, that although Bloch has the proper set 
of functions, he has nowhere attempted to solve the perturbation problem, 
but has merely taken his unperturbed functions as being correct, which 
amounts to getting the energy to the accuracy of the conventional “first 
order perturbations.” 

The first step in either Heisenberg’s or Bloch’s method, as we apply them, 
is to write an approximate solution as a product of functions of the individual 
electrons. Heisenberg takes, for these separate functions, the wave functions 
of electrons attached to individual nuclei; the number of such functions is 
the product of the number of nuclei, multiplied by the number of different 
sets of quantum numbers we consider for an individual nucleus. If we 
restrict ourselves to s states, there are then only two states per nucleus, 
corresponding to the two orientations of the spin. For nucleus a, we denote
these two³ by $u_\alpha(a/x)$, $u_\beta(a/x)$, and we have such a function for each nucleus
a, b, ..., n. Bloch takes, on the other hand, combinations of these functions:

$$u_\alpha(klm/xyz) = \sum_{g_1 g_2 g_3} e^{2\pi i \left( k g_1/G_1 + l g_2/G_2 + m g_3/G_3 \right)}\, u_\alpha(g_1 g_2 g_3/xyz),$$

where $g_1 g_2 g_3$ are the coordinates of a particular nucleus, $G_1 G_2 G_3$ the dimensions
of the rectangular crystal, and $u_\alpha(g_1 g_2 g_3/xyz)$ the wave function (as
used by Heisenberg) for an electron moving around the nucleus situated at
$g_1 g_2 g_3$. The function with k, l, m represents an electron, in general moving
in the direction k, l, m, but pausing at the various atoms on the way. There
are as many sets k, l, m allowed as there are atoms in the crystal; for larger
k, l, m the function proves to be merely a repetition of one already counted.


3 We use here for convenience in writing Pauli’s notation $u_\alpha$, $u_\beta$ for the spin, rather than
the more explicit but more cumbersome notation $u(n/x_i)\,\delta(m_s/m_{s_i})$ used in a previous paper.
See J. C. Slater, Phys. Rev. 34, 1293 (1929). The method used in the present paper is described, 
as applied to atoms, in the paper referred to; it should be understood that, although we speak 
here of using Heisenberg’s and Bloch’s methods, our actual procedure is quite different from 
that of these authors.

Now we actually set up the product of functions mentioned in the
previous paragraph: we pick one out and let it be a function of the coordinates
$x_1$ of the first electron, a second for the coordinates $x_2$ of the second, and
so on to the nth, and multiply them all together. By the exclusion principle,
no function can be chosen more than once. Then we form an antisymmetric
combination, by permuting the indices of the electron coordinates, and adding
the permuted functions with appropriate signs, obtaining essentially a
determinant. These antisymmetric functions are the ones with which we
start our perturbation calculation. Many such functions can be set up:
there are 2n functions of a single electron, of which only n are to be chosen
for each antisymmetric function, so that there are $(2n)!/(n!)^2$ different
functions. Our perturbation problem is that of finding which linear combinations
of these functions most nearly satisfy the wave equation. We may note
the restriction of Heisenberg’s method as he uses it; he does not include polar
states. That is, he does not allow for example the two functions $u_\alpha(a/x)$,
$u_\beta(a/x)$ to appear together in any product. This greatly limits the number of
functions; but although the terms obtained by it certainly represent the
lowest energy levels, since it requires energy to form a positive and a negative
ion from two neutral atoms, we do not make this limitation.
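
The counting in this paragraph can be checked by brute-force enumeration. The sketch below is only an illustration: the labels `"alpha"`/`"beta"` and the helper names are hypothetical, and each antisymmetric function is represented simply by the set of one-electron functions it contains.

```python
from itertools import combinations
from math import comb, factorial

def spin_orbitals(n):
    # One-electron functions for n atoms with s states only: (spin, atom).
    return [(spin, atom) for atom in range(n) for spin in ("alpha", "beta")]

def antisymmetric_functions(n):
    # Each determinant is a choice of n distinct one-electron functions
    # out of the 2n available (exclusion principle).
    return list(combinations(spin_orbitals(n), n))

def is_nonpolar(det):
    # Non-polar: no atom carries two electrons.
    atoms = [atom for _, atom in det]
    return len(set(atoms)) == len(atoms)

for n in (2, 3, 4):
    dets = antisymmetric_functions(n)
    nonpolar = [d for d in dets if is_nonpolar(d)]
    print(n, len(dets), factorial(2 * n) // factorial(n) ** 2, len(nonpolar))
```

The total agrees with (2n)!/(n!)², while the non-polar functions number only 2ⁿ (one electron of either spin on each atom), which shows how strongly Heisenberg's restriction limits the set.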

Having set up the unperturbed functions, we next make linear combinations
of them, by the method described in a previous paragraph. This
process can be simplified by using a property of the spin. Every unperturbed
function has a certain definite component $M_S$ of spin along the axis, equal to
$(n_\alpha - n_\beta)/2$, where $n_\alpha$ is the number of electrons with positive component of
spin, $n_\beta$ the number with negative. If now we neglect the magnetic interaction
between the spins and the orbital motion, the problems with each
value of $M_S$ can be handled separately: the components H(i/j) from a
function with one value to a function with another are zero. The states with
a given $M_S$ include, as one readily sees, all those states whose total spin S
is equal to or greater than $M_S$ (for just these S’s can be so oriented, on the
vector model, as to give a component $M_S$ along a fixed axis). Thus by solving
each such problem, and comparing, we can identify the spin of each term.⁴
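
The bookkeeping by $M_S$ described above can be sketched as follows. This is an illustrative count only (no energies are computed): since every term of total spin S contributes one function to each $M_S$ from −S to S, the number of terms with spin S is the number of functions with $M_S = S$ minus the number with $M_S = S + 1$. The helper names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def ms_counts(n):
    # Spin +1 stands for alpha, -1 for beta; n electrons on n atoms.
    orbitals = [(spin, atom) for atom in range(n) for spin in (+1, -1)]
    counts = Counter()
    for det in combinations(orbitals, n):
        ms = Fraction(sum(spin for spin, _ in det), 2)
        counts[ms] += 1
    return counts

def terms_by_spin(counts):
    # Number of terms with total spin S = N(M_S = S) - N(M_S = S + 1).
    result = {}
    for ms in sorted(counts):
        if ms >= 0:
            result[ms] = counts[ms] - counts.get(ms + 1, 0)
    return result

print(dict(ms_counts(2)))
print(terms_by_spin(ms_counts(2)))
```

For two atoms this gives one function with $M_S = 1$, four with $M_S = 0$ and one with $M_S = -1$, hence one triplet and three singlets.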

The two methods can be illustrated by the case of H₂. Here there are
$4!/(2!)^2 = 6$ different wave functions. On Heisenberg’s method, the four
functions for an individual electron can be symbolized by (αa), (βa), (αb),
(βb); two of these are to be picked out for each antisymmetric wave function.
Thus the six are (αa)(αb); (αa)(βa), (αb)(βb); (αa)(βb), (βa)(αb); (βa)(βb).
They are arranged, first, by $M_S$: the first has the value 1, the next four the
value 0, and the last −1. Thus the terms consist of one triplet and three
singlets. Among the four terms with $M_S = 0$, the first two are polar (and not
considered by Heitler and London, or Heisenberg), the last two are non-polar.
Immediately one finds that the sum of these non-polar functions is
the component of the triplet. We are then left with three functions, the
two polar ones, and the difference of the non-polar ones, from which to find
our three singlets. The difference of the polar ones is antisymmetric in the
nuclei, giving one state; their sum, and the difference of the non-polar functions,
give two functions symmetrical in the nuclei, between which we finally
solve the simple perturbation, resulting now in a quadratic secular equation,
and obtain the two remaining singlet states. The energy levels as a function
of the distance of separation are plotted in Fig. 1. The energy level of the
lowest, ${}^1\Sigma_S^N$, is almost exactly as given by Heitler and London, but its wave
function contains quite an appreciable contribution from the polar state.
The triplet is just the repulsive state of Heitler and London. The other two
levels are essentially polar. They go at infinite separation to the energy of
H⁺ + H⁻, greater than the other limit by the ionization potential less the
electron affinity of H (this rough approximation gives $-\tfrac{1}{4}Rh$ for the electron
affinity, so that the terms go to $\tfrac{5}{4}Rh$). The lower of these has a minimum;
it is presumably the polar part which, by combination with other functions,
leads to the experimentally known B state of the molecule. We notice that
at large separations the functions behave just like Heisenberg’s (extended)
unperturbed functions: a triplet and a singlet are non-polar, and go to the
lower energy; while two singlets are polar, and go to the higher level.

Fig. 1. Energy levels of H₂.

4 This is essentially the method used in the paper already quoted. It has already been
applied by Bloch to problems in the theory of metals. See F. Bloch, Zeits. f. Physik 57, 545
(1929).

Next we consider Bloch’s method for the same problem. His functions 
for one electron, for this case, are 

$$u_\alpha(0/x) = u_\alpha(a/x) + u_\alpha(b/x)$$

$$u_\alpha(1/x) = u_\alpha(a/x) - u_\alpha(b/x),$$

with similar functions for β. (These do not follow quite directly from the
general formulas given above; Bloch’s functions must be slightly modified 
for finite systems, for they apply rather to infinite but periodic ones.) 
The discussion of multiplicity given above goes through without change, if 
we only substitute 0, 1 for a, b. We can easily show by direct calculation
that the resulting unperturbed antisymmetric functions are linear combinations
of those found by Heisenberg’s method. For example, for $M_S = 1$,
there is only one function by either method, so these must be identical,
except for a numerical factor. By Heisenberg’s method the function is

$$u_\alpha(a/x_1)\,u_\alpha(b/x_2) - u_\alpha(b/x_1)\,u_\alpha(a/x_2).$$

By Bloch’s it is

$$\begin{aligned}
& u_\alpha(0/x_1)\,u_\alpha(1/x_2) - u_\alpha(1/x_1)\,u_\alpha(0/x_2) \\
&\quad = [u_\alpha(a/x_1) + u_\alpha(b/x_1)]\,[u_\alpha(a/x_2) - u_\alpha(b/x_2)] \\
&\qquad - [u_\alpha(a/x_1) - u_\alpha(b/x_1)]\,[u_\alpha(a/x_2) + u_\alpha(b/x_2)] \\
&\quad = -2\,[u_\alpha(a/x_1)\,u_\alpha(b/x_2) - u_\alpha(b/x_1)\,u_\alpha(a/x_2)].
\end{aligned}$$
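
The factor −2 relating the two $M_S = 1$ functions can be confirmed numerically. In this small check the atomic functions are replaced by arbitrary random numbers at the two electron positions (an assumption made purely for illustration, since the identity is algebraic and does not depend on the form of the functions).

```python
import numpy as np

rng = np.random.default_rng(0)
# Arbitrary stand-in values: ua[i] plays the role of u_alpha(a/x_{i+1}),
# ub[i] that of u_alpha(b/x_{i+1}), for the two electrons i = 0, 1.
ua = rng.normal(size=2)
ub = rng.normal(size=2)

# Bloch's one-electron combinations for the two-atom case.
u0 = ua + ub
u1 = ua - ub

heisenberg = ua[0] * ub[1] - ub[0] * ua[1]
bloch = u0[0] * u1[1] - u1[0] * u0[1]
ratio = bloch / heisenberg
print(ratio)
```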

We can set up the whole perturbation problem in these functions; and the 
solution can be carried out as easily as before, leading of course to just the 
same answers. The interesting question now is, how closely do Bloch’s 
individual functions approximate the correct ones, for small values of R?
The functions are respectively as follows: a singlet with both electrons in the 
state 0; a singlet and triplet with one in the state 0, the other in the state 1; 
and a singlet with both in the state 1. The state 0 corresponds to the lowest 
vibrational state on Sommerfeld’s theory, the state 1 to the next higher one, 
so that the first state has on the simple interpretation only the zero-point 
vibrational energy, the next two have each one quantum, and the last 
two. Examination of the actual wave functions shows that they agree quite 
closely with the functions of Bloch: the lowest one is made, it is true, by 
combination of the (00) and (11) states, both being ${}^1\Sigma_S^N$, but the coefficient of
the first is about eight times as large as that of the second, when R is such that 
the energy is at its minimum. The next two are made up of the (01) states. 
The highest is about eight parts of (11) to one of (00). The energies also show, 
for high compression, the behavior expected: the two states which should 
have one quantum of vibrational energy draw together, and the one with two 
quanta is just about twice as far above the lowest state as those with one. 
Even the spacing of these levels is just about what would be calculated on 
Sommerfeld’s theory for an electron vibrating in a region the size of the molecule.
Thus we see that Bloch’s unperturbed functions form fairly good approximations
to the real functions for the compressed state, as Heisenberg’s
do for the extended state.

We can now return to the general case, and make use of the fact that 
Heisenberg’s functions approximate the real wave functions well for large 
separations, Bloch’s for small. First, for the extended system, the energy 
is the ionization energy, on account of having many ions as well as neutral 
atoms. For a metal, it requires about 6 volts to form a positive and negative 
ion from two neutral atoms. Thus if all the atoms were ionized, we should 
have n/2 such pairs, or an energy per atom, or per electron, of about 3 volts. 
This measures the extension of the group of terms, for large R. It is a simple 
problem in permutations to find the number of terms of each multiplicity
with each energy value in the limit. The one term of highest multiplicity
will approach the lowest limit, for large R; it must be entirely nonpolar, for
all spins point in the same direction, so that no two electrons can be in the
same atom. For the next lower multiplicity, only one electron has a reversed
spin; it is the only one which can be in an atom with another electron, so
that there can be just one pair of atoms ionized. Following out, we easily
see that terms of lower and lower multiplicity, in the limit of large R, lie 
higher and higher, and at the same time are more and more spread out. They 
spread in such a way that there are terms of each multiplicity way down to 
the bottom limit, although not to the top. As we shall see later, for R large 
but not infinite, in the normal case, the really lowest terms have small spins; 
but near them are many terms with large spin. 

For the compressed system, the arrangement is as given by Bloch’s
theory. The total extension of the group of terms increases with $1/R^2$; for
ordinary values of R, it is of the order of the mean zero-point energy, times
n, which is decidedly larger than 3 volts × n. Thus not only do the curves
tend upward for decreasing R, giving repulsive energy levels, but they are
definitely doing this at the actual size of the metal. The general physical
interpretation of this repulsion is obvious: the valence electrons act here
approximately as a perfect gas, and the energy levels are those of such a gas
as it is compressed adiabatically against gas pressure, the energy varying
therefore as $V^{-2/3}$ or as $1/R^2$. Here the terms of high multiplicity lie in the
center of the pattern; those of lower spin also average in the center, but are 
more and more spread out. Since the terms of high spin are so low for large 
R, but not for small R, they must be even more repulsive than the others. The 
possibility seems very remote that any terms except those with very low 
multiplicity could be so low as to have minima, and come into the question 
for the normal state. We see that for cohesion we are interested only in the 
very lowest fraction of the whole set of terms. These terms almost all will 
go to the lowest energy level at infinite separation; they become in this 
limit non-polar. And the accuracy with which one can compute the lowest
states of H₂ from Heitler and London’s non-polar functions suggests that
here too this may be possible. Accordingly for our actual calculation of these
lowest states, we shall use Heisenberg’s method with only non-polar functions.
We shall find here, as we expect from our qualitative discussion, that
the terms of low multiplicity really do lie below, some of them being attractive;
while those of high multiplicity are repulsive, the highest spins lying
highest. Finally we shall consider the effect of polar terms, and conclude 
that it is really small on the low energy levels, although not on the wave 
functions; for it is the polar character of the wave functions which makes 
conductivity possible. 
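
The scaling of the energy with volume quoted above for the compressed system can be made explicit in a short worked step. It assumes, as the text does, that the valence electrons behave as a perfect monatomic gas compressed adiabatically; the symbols p, V and E for pressure, volume and energy are introduced here only for this sketch.

```latex
\begin{aligned}
pV^{5/3} &= \text{const} \quad\text{(adiabatic, monatomic perfect gas)},\\
E &= \tfrac{3}{2}\,pV \;\propto\; V^{-5/3}\,V \;=\; V^{-2/3},\\
V &\propto R^{3} \;\;\Longrightarrow\;\; E \;\propto\; R^{-2}.
\end{aligned}
```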

3. Electric and Magnetic Properties 

Conductivity. In the introduction we have mentioned the interpretation 
of electric conduction on Heisenberg’s and on Bloch’s scheme. One notices 
that a single one of Bloch’s functions implies conduction (the diagonal term
of the momentum matrix is different from zero), whereas with Heisenberg’s
functions we must have a continual change from one stationary state to
another. But it is particularly important to notice that, without polar
states, or free electrons, no conduction is possible; we cannot set up combinations
of non-polar states with a resultant momentum. For example, with two
electrons, we can set up an arbitrary non-polar function
$c_1 u(a/x_1)u(b/x_2) + c_2 u(b/x_1)u(a/x_2)$. If now we compute the momentum, whose operator is
$(h/2\pi i)(\partial/\partial x_1 + \partial/\partial x_2)$, the only possibly significant terms are the cross terms,
like

$$\begin{aligned}
& c_1 c_2 \frac{h}{2\pi i} \iint u(a/x_1)\,u(b/x_2) \left( \frac{\partial}{\partial x_1} + \frac{\partial}{\partial x_2} \right) u(b/x_1)\,u(a/x_2)\, dv_1\, dv_2 \\
&\quad = c_1 c_2 \frac{h}{2\pi i} \left\{ \int u(b/x_2)\,u(a/x_2)\, dv_2 \int u(a/x_1) \frac{\partial}{\partial x_1} u(b/x_1)\, dv_1 \right. \\
&\qquad \left. {} + \int u(a/x_1)\,u(b/x_1)\, dv_1 \int u(b/x_2) \frac{\partial}{\partial x_2} u(a/x_2)\, dv_2 \right\}.
\end{aligned}$$

On account of the penetration of one atom by the other, the integrals
$\int u(b/x_2)u(a/x_2)\,dv_2$ are not zero. The integral $\int u(a/x_1)(\partial/\partial x_1)u(b/x_1)\,dv_1$ is also
different from zero. But it is exactly cancelled by $\int u(b/x_2)(\partial/\partial x_2)u(a/x_2)\,dv_2$,
as one can show by Green’s theorem, so that the whole is zero. On the other
hand, if we set up a polar combination like $c_1 u(a/x_1)u(a/x_2) + c_2 u(b/x_1)u(b/x_2)$,
we again get two terms, but now they add, and give a current. As another
example, we can take the term of maximum multiplicity in any system. In
this term, we have seen by Heisenberg’s scheme that each atom has just
one electron, so that we expect no conduction. But in Bloch’s scheme, each
value k, l, m has just one electron. Since each such value is balanced by one
with −k, −l, −m, having opposite momentum, the total momentum is zero,
and there is again no current.
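
The cancellation claimed for the non-polar cross term, and its absence for the polar one, can be illustrated in one dimension. The sketch below makes several assumptions not in the paper: the atomic functions are replaced by normalized Gaussians, the spacing `d` is arbitrary, the integrals are done on a grid, and the constant factors $c_1 c_2\, h/2\pi i$ are left out.

```python
import numpy as np

x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]
d = 1.0  # hypothetical spacing between the two atoms

def orbital(center):
    # Normalized 1-D Gaussian standing in for an atomic function.
    return np.pi ** -0.25 * np.exp(-0.5 * (x - center) ** 2)

def integrate(f):
    # Simple Riemann sum; the integrands vanish at the ends of the grid.
    return np.sum(f) * dx

ua, ub = orbital(-d / 2), orbital(+d / 2)

overlap = integrate(ua * ub)                # integral of u(a) u(b)
d_ab = integrate(ua * np.gradient(ub, dx))  # integral of u(a) d/dx u(b)
d_ba = integrate(ub * np.gradient(ua, dx))  # integral of u(b) d/dx u(a)

# Cross term of the non-polar function: the two pieces cancel.
nonpolar_cross = overlap * (d_ab + d_ba)
# Cross term of the polar function: the two pieces are equal and add.
polar_cross = 2.0 * overlap * d_ab
print(nonpolar_cross, polar_cross)
```

The cancellation is just integration by parts: the two derivative integrals are equal and opposite because the product u(a)u(b) vanishes at infinity.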

We can now see the importance of considering exactly the wave functions, 
as well as the energy levels, of the lowest state. In the ordinary low states 
there will, of course, be no current. But near the lowest state, if there is to 
be conductivity, there must be combinations of polar states, having a current,
which are assumed in the presence of a field, and whose added energy
comes simply from the kinetic energy of the electrons and the self-induction. 
Such states are possible only on account of the presence of positive and 
negative ions, with the resulting free and associated electron conductivity. 

Magnetism. The lowest state of H₂ is the non-magnetic ${}^1\Sigma$, and we have
found such a situation in general. In the region where the lowest states have 
their minimum, the metal must surely be in a compressed state, Bloch’s 
arrangement of energy levels must be a good approximation, and the states 
of large spin must lie very high. We are thus led to the quite general conclusion
that the outer electrons, which are largely if not entirely responsible
for both cohesion and conduction, cannot produce ferromagnetic effects. If 
a metal is to be ferromagnetic, there must then be other electrons than these
outer ones which are responsible for it, and these others must have smaller
orbits, so that at the equilibrium distance of the outermost ones, the inner 
ones will be relatively further apart, and can be treated as an extended 
rather than as a compressed system. It is a very attractive hypothesis to 
suppose that in the iron group the existence of the 3d and 4s electrons provides 
in this way the two electron groups apparently necessary for ferromagnetism; 
for it is only in the transition groups that we have two such sets of electrons, 
and this criterion would go far toward limiting ferromagnetism to the metals 
actually showing it. 

We next ask just how such inner electrons could be ferromagnetic. Certainly
the general trend of the terms of high spin to the low energy values
at large R is an essential part of the question: there will be terms of large
spin near the lowest level. Bloch⁵ has discussed the problem, concluding
that for large R’s the terms of high spin actually lie lower than those of
smaller spin (he does not specifically discuss the dependence on R, but his
energy formulas all contain it parametrically). This conclusion, however, is
not correct; Bloch has merely computed diagonal values of the energy, with
respect to his functions, and for large R these by no means form approximations
to the actual energy values. From the correct treatment of the problem
as we have given it, it is plain that at all R’s there are terms of low multiplicity
as low as those of high spin, or lower. It may be, however, that the mere
presence of so many low terms of high multiplicity may be enough, on account
of their high a priori probability and large number, to insure that the
terms of large spin should be well represented at ordinary temperatures,
even though there are low terms of zero spin, and so produce ferromagnetism.
If, however, this should prove on calculation not to give the right effect, we
should be led to consider Heisenberg’s assumption that the normal order of
terms is inverted in ferromagnetic atoms, the terms of high multiplicity lying
lowest. He has shown by a general argument that electrons of large total
quantum number (which the 4$ electrons of iron have) have an exchange in¬ 
tegral of the opposite sign to that found in hydrogen, so that the order of the 
non-polar terms would be reversed. This we should fit into our scheme in the 
following way: although this exchange integral is anomalous at large R, it 
presumably changes sign and becomes normal at smaller R; for first, Heisenberg’s
general argument only applies at large R; and second, our condition
that the energy levels should approach those of Bloch at small R, with the
terms of large spin lying high, seems quite general. Thus we should assume
that the terms at small R lie as in Fig. 1; but that at a considerable value of
R, there is a crossing over (in this case the ${}^3\Sigma_A^N$ crossing and lying under the
${}^1\Sigma_S^N$), described by a change in the sign of the exchange integral K used
in the next section from negative to positive. By assuming the existence of
an inner group of electrons with these properties, we seem to secure a consistent
picture of ferromagnetism. On the other hand, of course it is always
possible that ferromagnetism is connected with the fact that the valence
electrons of iron have an orbital angular momentum different from zero.

5 F. Bloch, Zeits. f. Physik, 57, 545 (1929).

4. Cohesion

We are now prepared to begin the actual calculation of the lowest stationary
states. We make several simplifications, which we later remove.
First, we consider only non-polar states, in Heisenberg’s scheme, and disregard
exchange integrals except between adjacent atoms; this is the approximation
also made by Heisenberg. Finally, for the present, we consider a linear
lattice, n atoms uniformly spaced along a line, rather than a space lattice.
Our problem, of course, is to compute the matrix of the energy with respect
to the wave functions we have chosen, and then solve the problem of making
proper linear combinations. The computation of the matrix is simple. By a
fundamental formula of the previous paper mentioned above, the diagonal
components are a sum, first, of the energies of the separate atoms, which we
need not consider; next, a sum over all adjacent pairs, as the pair of atoms
a and b, of integrals J(a/b), which is essentially the diagonal energy $E_{11}$ of
Heitler and London; finally, a sum over all adjacent pairs which have the
same spin, of terms −K(a/b), where K is the exchange integral $E_{12}$ of Heitler
and London. Further, it is easy to show that all non-diagonal terms are
zero, except those for which the distributions in the two states differ only by
the exchange of an adjacent α and β; in such cases, the term is −K. In the
normal case, to which we shall refer specifically, J and K as functions of R
are both negative, K numerically greater than J. But in Heisenberg’s case,
K must be taken to be positive for large R, although presumably negative
for small R.

To illustrate by H₂, we have one state with both spins parallel; then the
energy is J − K. Next we have the problems with one parallel, the other
anti-parallel; there are two such states (the two polar ones being omitted).
Each has the diagonal energy J, and the non-diagonal energy between them
is −K. Thus the equations for the linear combinations are

$$\begin{aligned}
(J - W)\,S(1) - K\,S(2) &= 0 \\
-K\,S(1) + (J - W)\,S(2) &= 0,
\end{aligned}$$

giving energy values W = J±K, the first evidently being the singlet, the 
second the component of the triplet. 

In the general case, the computation of the matrix is no more difficult; 
the real problem is the solution of the linear equations for the S’s. We cannot 
do this exactly; but we adopt two methods of approximation, one holding 
for larger spins, the other for smaller spins. We first discuss the former. 

Method for large spins. First we take the problem where all spins are
parallel, $n_\alpha = n$, $n_\beta = 0$. Here there is but one state. Since with our linear
lattice there are $(n-1)$ adjacent pairs, and all spins are parallel, the energy
is simply $(n-1)J - (n-1)K$. Since J and K are normally both negative,
but K numerically greater than J, this is a positive energy for all values of R,
and results in a repulsive term. For Heisenberg’s case, on the other hand, K
is positive, and this term is attractive. Next we take the problem $n_\alpha = n-1$,
$n_\beta = 1$. There are now n unperturbed wave functions: the one electron β can
be attached to any of the n atoms. We number the functions by the number
of the atom where the electron is, only decreased by ½: we have $u_{1/2} \cdots u_{n-1/2}$.
Each of these functions will have the diagonal energy $(n-1)J - (n-3)K$,
since two of the adjacent pairs now have opposite spins, except for the two
functions $u_{1/2}$ and $u_{n-1/2}$ where our β electron is at an end of the lattice, and
the energy is $(n-1)J - (n-2)K$. Also, all non-diagonal terms will be zero
except those between terms of adjacent number, as for example between
those symbolized by

    ··· α α α β α α α ···

and

    ··· α α α α β α α ···,

and which differ by just one interchange of an α and β. As a result, the perturbation
equations will be

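$$\begin{aligned}
[(n-1)J - (n-2)K - W]\,S(\tfrac12) - K\,S(\tfrac32) &= 0 \\
-K\,S(\tfrac12) + [(n-1)J - (n-3)K - W]\,S(\tfrac32) - K\,S(\tfrac52) &= 0 \\
-K\,S(\tfrac32) + [(n-1)J - (n-3)K - W]\,S(\tfrac52) - K\,S(\tfrac72) &= 0 \\
\cdots \qquad & \\
-K\,S(n-\tfrac32) + [(n-1)J - (n-2)K - W]\,S(n-\tfrac12) &= 0
\end{aligned}$$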

These equations are easily solved; they occur, for example, in the problem of
a string weighted at equal intervals,⁶ the S’s being the displacements of the
weights. To solve, we merely assume $S(k) = c \sin(\alpha k)$ or $c \cos(\alpha k)$. The first and last
equations give boundary conditions. They become like the others if we
introduce an $S(-\tfrac12)$ and $S(n+\tfrac12)$, the first equation becoming

$$-K\,S(-\tfrac12) + [(n-1)J - (n-3)K - W]\,S(\tfrac12) - K\,S(\tfrac32) = 0,$$

and if we further set $S(-\tfrac12) = S(\tfrac12)$ and $S(n+\tfrac12) = S(n-\tfrac12)$. These are then the
boundary conditions; and to satisfy them we must take

$$S(k/p) = \cos\frac{p\pi k}{n}, \quad \text{where } p = 0, 1, \cdots, n-1.$$

Now we substitute this form in our difference equations; and we get for W

$$-K\left( \cos\frac{p\pi}{n}(k-1) + \cos\frac{p\pi}{n}(k+1) \right) + [(n-1)J - (n-3)K - W(p)]\cos\frac{p\pi k}{n} = 0,$$

from which in each case

$$W(p) = (n-1)J - (n-3)K - 2K\cos\frac{p\pi}{n}.$$

6 See, for example, Rayleigh’s “Theory of Sound.”

We now have the transformation coefficients S(k/p) and the energy values
W(p) for the rotation of axes to the pth stationary state; the W’s are the
exact energy levels. They are evidently distributed between the values
$W(0) = (n-1)J - (n-1)K$, and $W(n-1) = (n-1)J - (n-3+2\cos \pi(n-1)/n)K$
$= (n-1)J - (n-5)K$, almost, for large n. Obviously W(0) is the energy
of the level of highest multiplicity, which we have found before. Thus the
levels $p = 1 \cdots (n-1)$ are those of next to highest multiplicity.
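
The closed form for W(p) can be compared with a direct diagonalization of the perturbation equations for one reversed spin. The following is a minimal sketch with hypothetical values of n, J and K; the function names are introduced here only for illustration.

```python
import numpy as np

def one_reversed_spin_matrix(n, J, K):
    # Interior diagonal energy (n-1)J - (n-3)K; the two end functions,
    # with the beta electron on an end atom, have (n-1)J - (n-2)K.
    H = np.zeros((n, n))
    np.fill_diagonal(H, (n - 1) * J - (n - 3) * K)
    H[0, 0] = H[-1, -1] = (n - 1) * J - (n - 2) * K
    # Non-diagonal energy -K between functions of adjacent number.
    for k in range(n - 1):
        H[k, k + 1] = H[k + 1, k] = -K
    return H

def w_formula(n, J, K):
    # W(p) = (n-1)J - (n-3)K - 2K cos(p pi / n), p = 0 ... n-1.
    p = np.arange(n)
    return (n - 1) * J - (n - 3) * K - 2 * K * np.cos(p * np.pi / n)

n, J, K = 8, -0.3, -1.0  # hypothetical chain length and integrals
exact = np.sort(np.linalg.eigvalsh(one_reversed_spin_matrix(n, J, K)))
closed = np.sort(w_formula(n, J, K))
print(np.max(np.abs(exact - closed)))
```

The two sets of levels agree to rounding error, and the p = 0 level coincides with the all-parallel energy (n−1)J − (n−1)K.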

Next we take the problem with two electrons of spin β. There are
$n(n-1)/2$ such terms: each of the two indistinguishable β’s can be on any
one of the n atoms, so long as they are not on the same atom. Now it is convenient
to denote states by the two atoms, say i and j (each going from ½ to
n − ½) on which electrons β are. Our problem becomes analogous to that of a
square membrane loaded at equally spaced points. The diagonal terms
of the energy are all $(n-1)J - (n-5)K$, unless one of the β’s is at an end of
the lattice, or unless the two β’s are adjacent. There are four non-diagonal
terms for transitions from each wave function: for $i \to i \pm 1$, or for $j \to j \pm 1$.
A typical equation can be written

$$\begin{aligned}
& -K\,S(i, j-1) \\
& -K\,S(i-1, j) + [(n-1)J - (n-5)K - W]\,S(i,j) - K\,S(i+1, j) \\
& -K\,S(i, j+1) = 0.
\end{aligned}$$

This we satisfy by a product of cosine functions, $S(ij/pq) = \cos(p\pi i/n)
\cos(q\pi j/n)$. We easily find that these exactly satisfy the boundary conditions
when i or j $= \tfrac12$ or $n - \tfrac12$. There remains the condition when i is nearly
equal to j. If $i = j \pm 1$, the diagonal energy is $(n-1)J - (n-3)K$, since the two
$\beta$'s are together; on the other hand, since the $\beta$'s cannot be on the same atom,
the coefficients $S(j,j)$ and $S(j+1,j+1)$ vanish, so that only two transitions,
rather than four, are possible. If now we define an $S(j,j)$ and $S(j+1,j+1)$,
we can make the equations of the same form as the general one, if only
$S(j,j) + S(j+1,j+1) = 2S(j+1,j)$. This furnishes our second boundary
condition, which is evidently along the diagonal of our square “membrane.”
Unfortunately we cannot satisfy this condition exactly with our cosine
functions; closer investigation shows that one must have much more complicated
functions, with hyperbolic cosines, to satisfy it exactly, and one cannot carry
the method through for the general case. Approximately, however, we can
easily take care of our condition. If the p and q are not too great, so that the
“wave-length” of the waves in our membrane is large, we can replace our
condition by a differential one: it states that the amplitude at a point next
the diagonal is the mean of the two adjacent values on the diagonal, and this
very nearly means that the normal derivative of the function, at right angles
to the diagonal, is zero. This we can satisfy by making our function
symmetrical about the diagonal, or using $\cos(p\pi i/n)\cos(q\pi j/n) + \cos(q\pi i/n)
\cos(p\pi j/n)$. We may expect this to hold best for small p and q, not so well for
large values. It is clearly not right; for example, it yields $n^2/2$ functions,
instead of the correct number $n(n-1)/2$.

Our function is an exact solution of the difference equations, if not of the
boundary condition; and we find for the energy

$$W(p,q) = (n-1)J - (n-5)K - 2K\left(\cos\frac{\pi p}{n} + \cos\frac{\pi q}{n}\right),$$

where p, q go from 0 to $n-1$, but each pair is counted only once. The term
of highest multiplicity comes from $p = q = 0$; the $(n-1)$ terms of next highest
value are those with either p or q $= 0$, but the other not; the remaining terms
are of multiplicity smaller by two.

This result can now be generalized without trouble: if we have many 
$\beta$'s, the energy levels are given by

$$W = (n-1)J - (n-1-2n_\beta)K - 2K\sum_{i=1}^{n_\beta}\cos\frac{\pi p_i}{n}, \qquad p_i = 0 \cdots n-1.$$

The terms where one or more $p_i$'s equal zero are those whose total spin is
greater than $(n_\alpha - n_\beta)/2$; those with all p's different from zero are those
whose total spin equals $(n_\alpha - n_\beta)/2$. The number of the latter is evidently
enormously greater than that of the former: every spin has enormously more
terms than any higher spin. Thus the terms of a given component of spin along
the axis, and those of the same total spin, are approximately the same. We can
at once find the distribution in energy of the terms of a given spin. They
evidently cluster about the value $(n-1)J - (n-1-2n_\beta)K$; they are distributed
about this value like the displacements of a point simultaneously acted on by a
sum of $n_\beta$ periodic vibrations of equal amplitudes but arbitrary phases. This
gives, of course, approximately a Gauss distribution. The width of the
distribution curve can be derived very easily: we compute the mean square
deviation of the energy from its mean,
$\overline{(W - \overline{W})^2} = 4K^2\,\overline{[\sum_i \cos(p_i\pi/n)]^2}$, the average
being taken when each p varies independently from 0 to n. We can take
this variation to be continuous rather than discrete. Then the product terms
in the square of the sum of cosines all average to zero, the square terms
average to $\tfrac12$, and the result is $2K^2 n_\beta$. These results may be compared with
those obtained by Heisenberg on the group theory, and which as Bloch has
shown can also be found from the present method. In the notation of the
present paper, putting the number of neighbors of each atom equal to 2, and
leaving out the terms in J, Heisenberg finds

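$$\overline{W} = -\left(n - 2n_\beta + \frac{2n_\beta^2}{n}\right)K,$$

$$\overline{(W-\overline{W})^2} = 2K^2 n_\beta\left(1-\frac{n_\beta}{n}\right)\left(1+\frac{2n_\beta}{n}-\frac{2n_\beta^2}{n^2}\right).$$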

Our formulas agree with these exact ones to terms in $n_\beta$ but no further, as we
expect from the fact that our approximations hold only for small $n_\beta$. For
small p's, as we have seen, our results should be good even for large $n_\beta$; for
the case of ferromagnetism, when on Heisenberg’s hypothesis the terms are
reversed, these are the lowest terms, so that this result should be very useful
here. In the normal case, however, the lowest terms are those of large p,
and these are the ones we need for cohesion. About these lowest terms, we
can be fairly sure that they lie higher than the lowest ones we have found,
or $(n-1)J - (n-1-4n_\beta)K$, since the mean lies higher than the mean we
found. For zero spin, for example, we can be fairly sure that the term lies
above $(n-1)J + nK$. But this value need not be a very good approximation;
we actually find, by the method of the next section, that the lowest term for
zero spin is about $(n-1)J + 0.290\,nK$. Fortunately even this has a positive
coefficient for K, and is so an attractive rather than a repulsive term.
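
The comparison with Heisenberg's expressions can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the terms in J are left out as in the text, and the chain is taken long enough that $n-1$ and $n$ are interchangeable.

```python
def mean_approx(n, n_beta, K=1.0):
    """Center of the cluster from the cosine formula, terms in J omitted."""
    return -(n - 1 - 2 * n_beta) * K

def mean_heisenberg(n, n_beta, K=1.0):
    """Heisenberg's mean energy for two neighbors per atom, terms in J omitted."""
    return -(n - 2 * n_beta + 2 * n_beta**2 / n) * K

def variance_approx(n_beta, K=1.0):
    """Mean square deviation from the random-phase argument: 2 K^2 n_beta."""
    return 2 * K**2 * n_beta

def variance_heisenberg(n, n_beta, K=1.0):
    """Heisenberg's mean square deviation in the notation of the text."""
    x = n_beta / n
    return 2 * K**2 * n_beta * (1 - x) * (1 + 2 * x - 2 * x**2)

# Few reversed spins: the two widths agree to first order in n_beta / n.
ratio_dilute = variance_heisenberg(1000, 5) / variance_approx(5)
# Zero spin (n_beta = n / 2): the approximate width is too large by a factor 4/3.
ratio_half = variance_heisenberg(100, 50) / variance_approx(50)
```

For $n_\beta/n = 0.005$ the ratio differs from unity by about half a percent, while at $n_\beta = n/2$ it falls to 0.75, which illustrates why the approximation is trusted only for small $n_\beta$.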

Method for zero spin. For zero spin, $n_\alpha = n_\beta = n/2$, and there are
$n!/[(n/2)!]^2$ terms. We adopt quite a different method of classifying them.
Before, most of the terms of a given $n_\beta$ had nearly the same diagonal energy;
but now the range of energy is large, from $(n-1)J$ for the state with
alternating $\alpha$'s and $\beta$'s so that there are no parallel spins, to
$(n-1)J - (n-2)K$ for the state where all the $\alpha$'s come at one end of the
lattice, all the $\beta$'s at the other. With
this large range, we find it convenient to classify terms by their diagonal
energies; and we find as we should expect, that for the lowest states of the
perturbed system we must consider most the low unperturbed states. We do
not need to take into account all states: we find that approximately (though
by no means exactly) the terms can be divided into a number of
non-combining sets, and we set up one such set in the following way. We
commence with the lowest state, of energy $(n-1)J$, where $\alpha$'s and $\beta$'s alternate.
Next we consider the $n-1$ states which combine with it, coming from
interchanging one pair, and each having the energy $(n-1)J - 2K$, except those
from the two end pairs, with energy $(n-1)J - K$. We leave these two out,
retaining for our set the $n-3$ states which have the energy $(n-1)J - 2K$.
Each of these has $n-3$ states with which it combines, coming from
interchange of one of the $n-3$ adjacent pairs with opposite spins. Of these $n-3$,
the two in which the new interchanged pair is next the one already
interchanged have the energy $(n-1)J - 2K$; the one in which the pair already
interchanged is changed back has the energy $(n-1)J$; the two where the
end pair is interchanged have the energy $(n-1)J - 3K$; and the remaining
$n-8$ have the energy $(n-1)J - 4K$. We retain for our set only these $n-8$
terms of energy $(n-1)J - 4K$. So we proceed, asking which terms combine
with those already set up, and retaining just those whose energy is $-2K$
greater than for those with fewer interchanges. We find that a term of our set,
with the energy $(n-1)J - 2pK$, has non-diagonal terms to p terms of the
set of energy $(n-1)J - 2(p-1)K$ and to $(n-3-5p)$ terms of the set of
energy $(n-1)J - 2(p+1)K$. Evidently so long as p is small, the terms we
leave out of the set and yet which combine with terms of the set are
comparatively few. It is only for the large p’s that we make serious error by
leaving out these terms, and for large p the diagonal energy is high enough
so that for the lowest states of the perturbed system these unperturbed
states are unimportant. Thus we may reasonably believe that the low
energy levels found by solving this restricted problem will be approximately
some of the low levels of the actual problem. We can at least be sure of the
following: by the variation principle, they can be no lower than the actual
stationary states.

The other sets of non-combining terms which we can set up are easily
described, and are of considerable physical interest. Instead of starting
from the state with spins alternating, we start from a state where the spins
alternate up to a given point; there the sequence is interrupted, and
alternation commences again, so to speak, in the opposite phase, as

$\cdots\alpha\beta\alpha\beta\beta\alpha\beta\alpha\beta\alpha\cdots$.

With a few such interruptions in the course of the crystal, the energy is very
little above the really lowest state; yet a great many individual
interchanges would be required to pass to the lowest state. With such a state to
start with, we proceed just as we did before, and construct a whole system
of states; and the non-diagonal terms between this and the first system come
only from high values of p, involving many interchanges, and can be
neglected. Physically, at the interruption of phase, one essentially has a slight
interruption of crystal structure. Our catalogue of all possible states of the
metal includes not only that where it is one perfect crystal, but also where
it is composed of many smaller crystals not perfectly joined together.
Obviously each problem can be treated separately; physically it would take a
very long time to change from one to the other. And obviously each problem
will give us essentially the same set of energy levels.

We now take our set of wave functions, and try to solve the perturbation 
problem between them. For each value of p, we have many wave functions; 
and we look for those particular solutions for which all these functions have 
the same coefficient S(p). Afterwards we shall show that we really find the 
lowest solutions this way. Then, remembering the number of transitions with 
non-diagonal term K from a given state, computed above, we have for a 
typical equation 

$$-KpS(p-1) + [(n-1)J - 2pK - W]\,S(p) - K(n-3-5p)\,S(p+1) = 0.$$

This set of difference equations for the S’s is somewhat similar to what we
had before; it also corresponds to a weighted string. But now the properties,
and hence the wave-length, change from point to point, and we seek the
various overtones. The equation is a close analogue to Schrödinger’s equation,
in many ways; the fact that it is a difference equation rather than a
differential one is quite immaterial. To solve, we assume
$S(p) = e^{\int \alpha\,dp}$, where $\alpha$ is to vary slowly with p. Then
$S(p) = e^{\alpha}S(p-1)$, etc., so that we have

$$-Kp + e^{\alpha}[(n-1)J - 2pK - W] - e^{2\alpha}K[n-3-5p] = 0,$$

$$e^{\alpha} = \frac{-[(n-1)J - 2pK - W] \pm \left([(n-1)J - 2pK - W]^2 - 4K^2p[n-3-5p]\right)^{1/2}}{-2K[n-3-5p]}.$$

The equation expresses $e^{\alpha}$ as a function of p, for any particular W. Now we
must remember that there are essentially boundary conditions; the S’s must
remain finite for $p = 0$ and p = an extreme value. To tell how to apply this
condition, we must investigate the solution we have found.

The ratio $e^{\alpha}$ of successive coefficients is real or complex, according as
$[(n-1)J - 2pK - W]^2$ is greater than or less than $4K^2p[n-3-5p]$. Regarded
as a function of p, the limiting cases, where the two are equal, come from
$(n-1)J - W = 2pK \pm 2K[p(n-3-5p)]^{1/2}$. The right hand side, plotted as a
function of p, forms an ellipse; the straight line represented by the left side
intersects the ellipse in two points, or in none, depending on the value of W.
The region of W where it intersects can be found by computing the maximum
and minimum ordinates of the ellipse; that is, the values of $(n-1)J - W$
for which $(d/dp)(2pK \pm 2K[p(n-3-5p)]^{1/2}) = 0$. This gives
$p = [(n-3)/10][1 \pm (1/6)^{1/2}] = (n-3) \times (0.0592, 0.1408)$. At these two
limits, substituting, $W = (n-1)J + (n-3)K \times (0.290, -0.690)$. For values of W
between these limits, there is a range of p for which $e^{\alpha}$ is complex, and the
solution is oscillatory; outside this region, which is closed, the solution is in
any case exponential. To satisfy our boundary conditions, now, we have a problem
much like that with Schrödinger’s equation in one dimension; and boundary
conditions can be satisfied only if there is an oscillatory region. As a result,
the actual energy levels of the problem must lie between the limits given.
Closer examination shows that a “quantum condition” can be applied, and
that between these limits there are just the number of energy levels there
should be. We now have the lowest level: it lies arbitrarily close to our lower
limit, or is

$$W = (n-1)J + 0.290(n-3)K,$$

as we stated in the last section. In this lowest state, we can show without 
trouble that the unperturbed wave functions with p near 0.0592(n — 3) are 
represented most strongly. Thus the value of p is really quite small; relatively 
few pairs are interchanged, and we are safely in the region where we can treat 
the different systems separately. 
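
The two limiting values quoted for the ellipse can be reproduced directly. The following sketch is an illustration with hypothetical names; it works in the scaled variable $t = p/(n-3)$ and returns the coefficient of $(n-3)K$ in W at each extreme ordinate.

```python
import math

def band_limits():
    """Return [(t, w)] at the two extreme ordinates of the ellipse.

    t is p / (n - 3); w is the coefficient in W = (n - 1) J + w (n - 3) K.
    """
    limits = []
    for sign in (-1.0, +1.0):
        # Stationary points of 2 t + sign * 2 sqrt(t (1 - 5 t)).
        t = (1 + sign / math.sqrt(6)) / 10
        ordinate = 2 * t + sign * 2 * math.sqrt(t * (1 - 5 * t))
        # The ordinate equals ((n - 1) J - W) / ((n - 3) K), so w is its negative.
        limits.append((t, -ordinate))
    return limits

(t_low, w_low), (t_high, w_high) = band_limits()
```

With these definitions `t_low` and `t_high` round to 0.0592 and 0.1408, and `w_low` and `w_high` round to 0.290 and -0.690, in agreement with the figures in the text.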

We have solved our problem for the lowest state in which all terms of 
the same p have the same coefficient. We can now investigate the effect of 
removing this assumption, varying the coefficient of one function of a given 
p in one direction, varying the rest to keep the same total representation for 
functions of this p, and calculating the change in the energy. When we do 
this, we find the energy to be a minimum with respect to such variation; in 
fact, the changes of energy compensate each other to a higher order, showing 
that the problem is nearly degenerate with respect to these coefficients. 
Thus we may be rather confident that we have a good approximation to the 
lowest non-polar states. It is of course obvious that this method becomes 
worse as we go to higher states. 

It is instructive to ask what ordinary perturbation theory would give us
for the lowest state. The lowest unperturbed state has the diagonal energy
$(n-1)J$; this represents the ordinary first order perturbation calculation.
Now we pass to the second order calculation. The lowest state is not
degenerate, so that we can use the power series development method. The next
term in the expansion is $\sum_j H_{ij}H_{ji}/(E_i - E_j)$, summed over all excited
states j. Now there are non-diagonal terms $H_{ij}$ only to the $n-1$ states with
$p = 1$. Thus the $H_{ij}$'s are all equal to $-K$, and the energy differences are all
given by $E_j = E_i - 2K$. Thus we have as second approximation

$$W = (n-1)J + (n-1)\frac{K^2}{2K} = (n-1)J + \tfrac12(n-1)K.$$

This differs from our result in having the factor $\tfrac12$ rather than 0.290; we
have only the first term of a series development, but it is reassuring that
agreement is as good as it is. For finding the order of magnitude, we could
use this term alone; we shall find this simple method useful with the space
lattice.

Effect of polar states. One can make an estimate, by a method like that 
used here, of the effect of the polar states in depressing the non-polar ones, 
which alone we have considered so far. We can build up a series of states by 
starting with a given non-polar state; then removing one of the electrons to 
an adjacent atom, producing a positive and a negative ion; then removing a 
second; and so on. The series of states so found behave formally like those 
used above. If we solve the problem by the previous method, or by the 
second order perturbation method, we get a further depression of the lowest 
state, which again can be written as 

$$(n-1) \times \frac{\text{square of non-diagonal term}}{\text{diagonal energy difference}}.$$

The non-diagonal term which comes in here is presumably of the same order 
of magnitude as before, although it is a somewhat different integral. But 
the diagonal energy difference is now essentially an ionization energy, which 
is of the order of several volts, rather than the fraction of a volt that K is. 
Thus the effect on the energy is a number of times smaller than what we 
found before, and we can neglect it. It is not worth while calculating more 
accurately, in this approximation; for with H 2 , it appears that on account of 
the lack of orthogonality of the wave functions, the actual depression of the 
energy is very much less than this rough method would indicate, although 
the effect on the wave function is about what we should expect. One can 
reasonably believe for this reason that the polar states in the crystal depress 
the energy only very little. But we recall that their effect on the wave
function is to introduce free electrons. By our rough method described above,
we infer that the fraction of free electrons is of the order of 1 percent, for 
reasonable choice of the constants. This could easily be in error by a factor 
of 10 either way; but at least we see that a definite meaning can be attached 
to the number of free electrons, and that there is a definite procedure for 
calculating this number. 

Normalization and orthogonality. We have not considered the lack of
orthogonality of the wave functions, resulting in factors like the $1/(1+S)$ of
Heitler and London. When one tries to do this, one immediately strikes a
difficulty which appears insurmountable: the factor in the denominator,
instead of being like $1+S$, is like $1 + nS + \cdots$, where n is the number of
atoms, so that the term $nS$ is enormous compared with unity. On
examination of simple cases, it appears that the remaining terms, coming from other
permutations than the simple interchange, are also important, in many cases
the terms almost cancelling each other. Further, in every expression for
energy, like the simple $(J+K)/(1+S)$, there are more terms in the
numerator, also of great importance. But the simple cases give no suggestion of how
to treat the general case. The key to this difficulty comes from Bloch’s
method. For example, the term of maximum multiplicity has one function,
which can be expressed either by Heisenberg’s or Bloch’s functions.
But the difference is that Bloch’s functions are really orthogonal, unlike
Heisenberg’s, so that we meet no such difficulty. Of course, the same terms
occur, but now in the normalization of the individual functions. And the
numerators, and denominators like $1 + nS + \cdots$, appear as products of n
factors, each of approximately a simple Heitler and London form; further,
all but one or two of these factors of the denominator cancel against equal
factors in the numerator, giving very simple results. Essentially the same
method can be used with the other states; for this method is one for treating
a determinant of Heisenberg’s wave functions, and converting it into a
determinant of Bloch’s functions; and all of our wave functions are products
of two such determinants. When we calculate in our case, it appears that the
terms S will have small effect; we are roughly half way between the cases $1+S$
and $1-S$, and the effects of S nearly average out. This method at the same
time gives the proper way of considering more distant pairs, as well as
adjacent ones; these contribute the further terms in the numerators, as
$J \pm K + \cdots$. We see from the next paragraph that these more distant pairs are
really quite important.

Method for space lattice. So far, we have spoken about a linear lattice of
atoms, rather than a space distribution. We now extend this theory to a
crystal; but we shall not carry it through in the same detail. We consider only the
problem of zero spin, and use our second order perturbation approximation.
Let us take the body-centered cubic lattice, which the alkalies have. The
lowest unperturbed state of this lattice can be set up much as with the linear
one: we let the electrons at the corners of the cubes have the spin $\alpha$, those at
the centers the spin $\beta$. Then each electron is surrounded by eight others of
opposite spin, so that if we consider only adjacent pairs, the diagonal energy
of this state is $4nJ$, where there are n electrons, $4n$ pairs. This lowest state
now has non-diagonal terms, each equal to $-K$, to the $4n$ states obtained by
interchanging an adjacent pair. Each of these states has two misplaced spins,
each surrounded by 7 spins of the same sign, so that the energy has a term
$-14K$. Thus for our perturbation problem, we have a non-diagonal energy
$-K$, an energy difference $14K$, and $4n$ non-diagonal terms, so that the
perturbed energy is $4nJ + (4nK^2/14K) = 4nJ + (2/7)nK$.

This formula is rather significant. We compare the energy with that of
the lowest state of the diatomic molecule, $n = 2$, which is $(n/2)J + \tfrac12 nK$. We
observe that for the crystal the coulomb interaction, the term J, has a
coefficient eight times as great: each atom has eight neighbors instead of one, each
penetrating. On the other hand, the valence term K has a coefficient only
2/7, instead of 1/2. The valence, so to speak, is spread out among all the
neighbors, and weakened in the process. It is partly on this account that we
can say that the coulomb interaction is the more important part of the
cohesive force, in metals.
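
The coefficients compared in this paragraph follow from simple counting, which the sketch below reproduces with exact fractions. It is an illustration only; the helper name `second_order_coefficient` and its arguments are hypothetical, and only adjacent pairs are counted, as in the text.

```python
from fractions import Fraction

def second_order_coefficient(excited_states_per_atom, parallel_pairs_created):
    """Coefficient c in the second-order shift c * n * K.

    Each excited state contributes (-K)^2 / (m K) = K / m, where m is the
    number of parallel pairs created by the interchange.
    """
    return Fraction(excited_states_per_atom, parallel_pairs_created)

# Body-centered cubic: 4n interchanges, each creating 2 * 7 = 14 parallel pairs.
crystal_valence = second_order_coefficient(4, 14)
# n atoms as n/2 diatomic molecules, each of energy J + K: coefficient 1/2 of n K.
molecule_valence = Fraction(1, 2)
# Coulomb coefficients: 4n pairs in the crystal against n/2 in the molecules.
coulomb_ratio = Fraction(4, 1) / Fraction(1, 2)
```

The exact fractions make the contrast explicit: the coulomb coefficient rises by a factor of 8, while the valence coefficient falls from 1/2 to 2/7.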

We have considered only those pairs with smallest separation, and they
give a definite attraction. But in this lattice, there are not only the eight
nearest atoms at distance R; there are also six, in directions parallel to the
edges of the cube, at distance of $1.155R$, and these have parallel spins,
producing therefore a repulsion. In the diagonal energy, each pair will then
contribute an energy $J - K$, a positive amount, so that the diagonal energy
of the lowest state is $4nJ(R) + 3n(J(1.155R) - K(1.155R))$. The next higher
diagonal energy also will differ from this not merely by $-14K(R)$, but also
by an amount $12K(1.155R)$, because by interchange of two spins some of
these repulsive terms are removed. Thus the lowest energy level, counting
also these pairs, is

$$4nJ(R) + 3n\bigl(J(1.155R) - K(1.155R)\bigr) + \frac{4nK^2(R)}{14K(R) - 12K(1.155R)}.$$


This results, on computation, in a much weakened attraction. If we were 
to consider in succession the effects of pairs at greater and greater distance, 
we should come in succession to attracting atoms with antiparallel spin, 
and repulsive ones with parallel, so that the successive approximations to the 
energy would oscillate, falling first above, then below, the true value. 
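
The neighbor counts and the factor 1.155 used above are properties of the body-centered cubic lattice and can be confirmed by direct enumeration. The sketch below is an illustration with hypothetical names; it uses integer coordinates in units of half the cube edge so that no rounding enters the shell counts.

```python
import math
from collections import Counter

def bcc_shells(reach=2):
    """Count bcc sites by squared distance from the origin, in units of (a/2)^2.

    Only the innermost shells are complete for a small reach.
    """
    counts = Counter()
    cells = range(-reach, reach + 1)
    for i in cells:
        for j in cells:
            for k in cells:
                # Cube corners sit at even coordinates, body centers at odd ones.
                for site in ((2 * i, 2 * j, 2 * k),
                             (2 * i + 1, 2 * j + 1, 2 * k + 1)):
                    d2 = sum(c * c for c in site)
                    if d2:
                        counts[d2] += 1
    return sorted(counts.items())

shells = bcc_shells()
(d2_first, n_first), (d2_second, n_second) = shells[0], shells[1]
distance_ratio = math.sqrt(d2_second / d2_first)  # second-shell distance over R
```

The first shell holds 8 sites and the second 6, so a crystal of n atoms has $4n$ nearest pairs and $3n$ second-neighbor pairs, and the distance ratio is $2/\sqrt{3} \approx 1.155$.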

Application to sodium. For the sodium crystal, approximate calculations 
have been carried out, to test these formulas. These were made by taking a 
simple analytical expression for the wave function of the valence electron 
of sodium, and computing the integrals J and K. The details of the
calculation will not be given here. The first thing that one notices is that, for Na, J
is several times larger in proportion to K than in hydrogen. It is this fact, 
taken together with the increased coefficient of the J term, that results in the 
importance of the coulomb term. It is also significant in connection with 
the question, why do the alkalies, and metals in general, form metallic
lattices, while hydrogen does not? We can see the essential answer from our
energy formulas of the previous page. For substances where J is the
important term, the coefficient of J will be greater, and the energy lower, for
the crystal than for the same number of atoms in diatomic molecules, and the 
crystal will be the stable form. For hydrogen, on the other hand, the valence 
term K is the important one. Here the coefficient in the molecular form is 
greater; and even if the metallic form of such a substance were stable in 
the sense of having a minimum of energy for some definite size, as seems 
quite possible, still the energy in the molecular state would be lower. The
atoms in the crystal would tend to form pairs, resulting in a molecular
lattice; the molecules would repel each other, and would be held together only
by van der Waals forces, which have been neglected in this paper. This seems
to be exactly what hydrogen does.

The numerical values for Na are approximately as follows. If we take
only the adjacent pairs, the minimum comes at $R = 4.9a_0$ approximately,
rather seriously less than the correct value 7; this can partly be explained
by the observation that the best atomic wave function for use in the crystal
would be more extended than that determined from the free atoms, which are
here used. The energy at this point comes out about −40 kg cal/gm mol,
the coulomb term supplying about four fifths of this; the observed heat of
vaporization is 26.4 kg cal, so that this gives, as we should expect, too large a
value. If now we consider the repulsive pairs at distance of $1.155R$, the
situation is quite changed. In the first place, the energy is reduced from
−40 to about −9 kg cal. When we remember that these two values are the
first two terms of a series, whose value oscillates on both sides of the answer,
it seems very reasonable that the final result should be not far below the
experimental value. The problem of properly computing this energy must
be done by the method, using Bloch’s functions, described in the preceding
section. In the next place, the minimum of the curve is greatly broadened:
for quite a range of values, from $R = 4.9$ (the previous minimum) to $R = 7$, the
energy stays about constant, the change of the attractive term being just
about balanced by the relatively more rapid change of the smaller repulsive
effect. (For smaller R's, a situation can be found when the denominator
$14K(R) - 12K(1.155R) = 0$, so that the function becomes infinite; but this
is without physical significance.) No doubt a persistence of this effect in the
final answer helps to correct the improperly low grating space we have
already found. It also is interesting in connection with the compressibility.
The alkalies are remarkably compressible, and if we compute the
compressibility for the case where only adjacent pairs are considered, the result is too
small by a factor of 2 or 3. On the other hand, considering the next set of
atoms, our very broad minimum would give much too great a
compressibility. Here again it seems that our result may oscillate, perhaps approaching
eventually something near the right value.

I. A Tribute to John Clarke Slater

Professor Slater has been, and continues to be at age 65, one of the twentieth
century’s great men of science. Even a partial documentation of his direct
contributions to physical science is most impressive and quite awe-inspiring
to anyone who envisions himself a participant in physical research. Indeed,
as C. P. Snow has said of Lord Rutherford, “He seemed ten percent larger
than life.” Rather than a tabulation and analysis of accomplishments this
appreciation offers some observations and a viewpoint on the style,
environment, and type of emphasis Slater has brought to electronic structure theory
particularly in the period since World War II.

A student confronting J. C. Slater in the late 1940’s or 1950’s found a
legendary and distinguished figure, full of honors and authority as head and
essential creator of the Physics Department of the Massachusetts Institute of 
Technology. When first encountered he appeared to all, and remained for 
most, a rather terrifying person—one felt reluctant to ask questions or 
attempt scientific interchange. In his papers, and especially in his course 
lectures and research talks, one quickly found a very different man. Not only 
were the presentations perfectly organized and timed, but the words were 
clear, well phrased, and in basically simple language. All in all the transfer of 
ideas occurred as a highly communicative process. At all levels the courses 
and the method of description were completely unique and, although ranging 
over the whole structure of matter (save the nucleus), the material itself was 
almost entirely original. Perhaps half mathematics, the theory was not couched 
in the popular formalism of field theory or diagrammatic schemes and one
easily obtained a very direct physical picture of phenomena. The lectures had 
sweep and style: on the one hand a fundamental, unified, and general
approach was always maintained, on the other hand one distinctly felt the methods
capable of practical implementation. It was forever the many-electron wave 
function, its properties and description, that mattered rather than specific 
applications to current topics in chemistry or solid-state physics. Although 
there were many personal occasions to remind oneself of Slater’s thorough 
knowledge of experimental physics, the courses and most of the research 
were oriented more toward methods for generating many-electron wave 
functions than to the analysis of specific experimental data. Development of 
theorems and approximation schemes having direct parentage in Schrodinger’s 
equation, reliance on physically well-defined theory, and instinctive mistrust 
of simple model theories have been cornerstones to Slater’s approach. In 
view of this orientation it is not at all surprising that many of Slater’s
penetrating insights have only manifested their efficacy with the advent of large-scale
digital computers and only after elaborate calculations have been performed. 
It is revealing in this regard to find that Slater's texts alone, among all of 
those devoted to molecular quantum mechanics and quantum chemistry, 
emphasize and fully develop Hartree-Fock theory. Yet it is now clear that the 
molecular Hartree-Fock solution is the most fundamental and practically 
important starting point for all of chemical structure theory. 

Looking back a generation to the small collection of distinguished men 
who since the middle 1920’s have contributed so much to modern physical 
theory one is struck by the fact that Slater alone among this vanguard of 
great mathematical physicists continued to pursue electronic structure theory. 
Already in 1933 he had produced a comprehensive text (“Introduction to
Theoretical Physics”) containing chapters that brought one to the forefront
of electronic structure theory. His characteristic style and much of the outline 
for the next 33 years of an enormously productive career are clearly apparent.
Now again during these last two years, 1965 and 1966, we find two more
completely unique and original texts (“Quantum Theory of Molecules and
Solids,” Vols. I and II), again representing the most up-to-date and definitive
works on ab initio electronic structure theory! It is well known that fads in
physics change rapidly and Slater’s path of “sticking with the problem” has
not been universally fashionable, either in the worlds of physics or chemistry. 
It would be untrue to represent it otherwise. In physics, of course, the main 
line of fundamental investigation since the 1930’s has been toward elementary 
particles while those in solid state have most often been satisfied with much 
more empirical and restricted theory, closely tied to specific experiments. 
Chemistry is overwhelmingly a problem of complexity, largely organized by 
qualitative, macroscopic rules—only now is the desire arising to understand 
phenomena on a more quantitative microscopic basis and only just now are we 
approaching adequate technical tools for handling the simplest molecules of 
genuine chemical interest. Thus we conclude with perhaps the most significant 
aspect of Slater’s career: on the eve of institutional retirement, when the 
impact of even great men wanes, we find that physical science is beginning to 
realize a need for full and fundamental understanding of complex electronic 
structure theory and is now “catching up” with Slater in his approach to this
problem. There is no question that Slater’s influence on the course of science 
will be even greater during the next thirty-five years than it has been in the 
past thirty-five. 


II. Nature of the Problem 

The principal reason why intermediate density many-electron theory 
deserves to be classed as one of the most important problems in contemporary 
science is that the overwhelming majority of natural phenomena find their
origin in the detailed pattern of electron motion. Diversity, complexity, and 
vast variety are the characteristic features of this problem, in sharp contrast to 
mathematical abstraction and ultimate simplicity, the predominant elements 
in problems handled successfully by the methods of theoretical physics. A key 
concept in organizing ideas is the relative information content of a set of data 
obtained from experiment by instrumental measurements or generated
theoretically. In a general way the information-content concept best illuminates
the various aspects of current electronic structure research. 

A. Chemistry from small polyatomic molecules

Quantitatively meaningful ab initio polyatomic solutions have been obtained
during the last two years for molecules with one, two, or three atoms
of those from He to Ne and up to eight attached hydrogens. (The next section
of this paper reviews a good part of this work.) Relative to the 103 different
atoms of the periodic table and to the size of most chemically important
molecules these polyatomic species at first appear to be insignificantly closer
to chemical reality than the diatomics for which well-established sets of
wave functions now exist. However, chemistry really starts at three-atom
systems and the amount of fundamental chemical information brought forth
from wave functions for even simple polyatomic molecules is truly surprising,
particularly if the solutions are carried out for numerous nonequilibrium
geometries (1). Examples from the Princeton Laboratory are: the origin of
rotational barriers; the nature of the hydrogen bond; the properties and
characteristics of electron-deficient species (e.g., BH3 vs B2H6 and B2H6 vs
C2H6); the origin and generality of Walsh’s rules (inorganic stereochemistry);
characterization and comparison of boron-nitrogen bonding with carbon-carbon
bonding; the basis for bond formation with noble gas atoms and
predictions for possible new compounds; proton affinities of ammonia,
water, and methane, and criteria for certain structural changes in water and
ice; stability and shape of important organic radicals and reaction intermediates
(e.g., CH^, methylene, the methyl carbanion, and cyclopropenyl
cation, C3H3). These latter four species are among a small number of relatively
simple radicals and ions which are central to a large fraction of organic
reactions but are generally inaccessible to direct instrumental measurement.
Another area in which theory has a unique role occurs where a repulsive
potential surface is likely or where synthetic pathways are very indirect
(e.g., HeF2, NeF2, HeO, NH4, H3O, BH3).

B. Limitation on Ab Initio solutions 

In spite of two hundred years of instrumental measurements, the chemical 
structure symbols and reaction equations which have been derived from this 
experience do not yield nearly enough information to tell us all that we desire
to know about properties and detailed mechanisms. On the other hand, it is
equally clear that numerical experiments based on ab initio solutions to
the Schrödinger equation contain far too much information.

The size of molecule for which an ab initio wave function may be obtained
is limited simply by the time required to generate and manipulate three- and
four-center two-electron integrals. In some of the calculations reported in the
next section there are 50,000 of these and each is computed to nine significant
figures, certainly representing many times more bits of information than
required to delineate all chemical aspects of the problem no matter how
indirectly the desired chemical numbers are related to the many-center
integrals. Our approach to this information problem is twofold: First, we are
plotting integral distribution functions for the various types of integrals
involved under various geometries and at various levels of approximation.
Second, we are developing mathematical inequalities relating changes in
sums of one-electron molecular orbital energies to changes in total energies.
Already this investigation has revealed that there is an inherently low information
content in the specification of bond angles. One of the most important
future efforts will be attempts to relate additional chemical properties to
well-defined subunits of ab initio wave functions.
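
A rough sense of the quantities involved can be had from a few lines of arithmetic. The sketch below is an illustration only and is not part of the calculations reported here: it counts the symmetry-distinct two-electron integrals over n real basis functions and estimates the number of bits carried by 50,000 values of nine significant figures. The function names and the choice n = 25 are assumptions made for the example.

```python
import math


def unique_two_electron_integrals(n_basis: int) -> int:
    """Count symmetry-distinct integrals (ij|kl) over n_basis real functions."""
    pairs = n_basis * (n_basis + 1) // 2   # distinct products of two basis functions
    return pairs * (pairs + 1) // 2        # distinct pairs of such products


def bits_for_significant_figures(digits: int) -> float:
    """Bits needed to distinguish 10**digits values (sign and exponent ignored)."""
    return digits * math.log2(10)


# Hypothetical basis size chosen so the count is comparable to the 50,000 quoted above.
n_integrals_25 = unique_two_electron_integrals(25)

# Information carried by 50,000 integrals at nine significant figures each.
total_bits = 50_000 * bits_for_significant_figures(9)

print(n_integrals_25, round(total_bits))
```

Under these assumptions about 25 basis functions already give a count close to 50,000, and the nine-figure values amount to roughly a million and a half bits.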

C. Connection between Ab Initio solutions and simple model theories 

Unquestionably the major issue facing electronic structure theory is that of 
making continuous connection between ab initio solutions and the often 
successful, largely ad hoc, model theories of chemistry and solid-state physics. 
As more quantitative results for low-symmetry electronic systems are demanded
it is becoming apparent that implicit to this task is the fact that current
model theories are inadequate, at least in the sense that we do not know what
circumstances are required for the models to yield reliable predictions.
Extended Hückel theory and other models, like chemical structural formulas,
do not generate enough information to convincingly differentiate many 
chemical phenomena. On the other hand, it is also true that to make electronic 
structure theory an everyday tool of chemical research quite simplified theories 
are going to be required. 

It appears likely that the most promising resource for developing simplified 
theories will be ab initio numerical experiments on polyatomic molecules 
rather than the traditional recourse to data from instrumental experiments. 
Similarly, the appropriate mathematical methods will be more akin to techniques
in statistical communication theory than they will to the transformation
theory and diagrammatic techniques derived from the field of mathematical
analysis. It is perhaps not surprising that a longer period has elapsed between
the enunciation of Schrödinger’s equation and the present day than between
the discovery of the electron and the advent of wave mechanics. 


III. Recent Advances in Polyatomic Electronic Structure Theory 

A. Basis sets and many-center molecular integrals 

The two principal impediments to widespread realization of ab initio
polyatomic molecular wave functions have been: first, the intricate, numerically
complex and time-consuming effort required to generate the very large
number (500-50,000) of two-electron, three- and four-center, six-dimensional
electrostatic interaction integrals; second, the choice of analytical form for
the basis set and the multidimensional nonlinear parameter search identified
with an adequate representation of the charge distribution. These two problems
are interdependent. Among the several attractive schemes for evaluating the
many-center molecular integrals the currently most successful methods are
all based on the special properties of Gaussian functions. Among the Gaussian
based methods the simplest and fastest is direct use of lobe functions, largely
because of the elimination of spherical harmonics and because only the simplest
Gaussian forms are employed (2). With respect to the orbital basis, the
key decision is choice of Hartree-Fock atomic orbitals, because these lead to 
molecular wave functions close to a molecular Hartree-Fock solution thereby 
largely eliminating the parameter search problem, and because they lead to 
direct chemical interpretability of the wave functions (3,4). It is not obvious at 
first that Gaussian lobe functions can efficiently span an atomic Hartree-Fock 
solution, particularly since one must be concerned with simultaneous matching 
of both the radial and angular dependence of the Hartree-Fock orbitals. 
Nevertheless, after considerable experimentation, it has turned out that 
sufficient accuracy can be realized with relatively few Gaussian lobe functions 
(15 or 16 for atoms such as C, N, O, and F) (5). The accuracy of this 
representation has been tested not only in terms of total energy but also by 
computing expectation values such as the quadrupole moment and quadrupole 
coupling constant for individual orbitals. Essentially double-zeta function 
quality is found for all tests. 
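
The lobe idea can be illustrated numerically. In the sketch below, which is an assumption-laden illustration and not the program used for the work reported here, a p-like function is formed as the difference of two spherical Gaussians displaced by a small distance on either side of the nucleus, and it is compared with the conventional form x exp(-alpha r^2). The exponent, the displacement, and all function names are hypothetical choices for the example.

```python
import math


def s_gaussian(alpha, center, point):
    """Unnormalized spherical Gaussian exp(-alpha * |r - R|^2)."""
    dist2 = sum((p - c) ** 2 for p, c in zip(point, center))
    return math.exp(-alpha * dist2)


def lobe_p_x(alpha, offset, point):
    """p_x-like function: two s-type lobes at +/- offset on the x axis, opposite signs."""
    plus = s_gaussian(alpha, (offset, 0.0, 0.0), point)
    minus = s_gaussian(alpha, (-offset, 0.0, 0.0), point)
    return plus - minus


def cartesian_p_x(alpha, point):
    """Conventional unnormalized p_x Gaussian, x * exp(-alpha * r^2)."""
    r2 = sum(p * p for p in point)
    return point[0] * math.exp(-alpha * r2)


alpha, offset = 0.5, 0.05  # illustrative values, not taken from the paper

# For small offset the lobe pair equals scale * x * exp(-alpha r^2) to leading order.
scale = 4.0 * alpha * offset * math.exp(-alpha * offset ** 2)

sample_points = [(0.3, 0.0, 0.0), (0.5, 0.4, -0.2), (-1.0, 0.7, 0.1)]
ratios = [
    lobe_p_x(alpha, offset, pt) / (scale * cartesian_p_x(alpha, pt))
    for pt in sample_points
]
print(ratios)
```

The ratio stays very close to unity at the sampled points, which is the sense in which two simple lobes can stand in for a function carrying angular dependence without any spherical harmonics entering the integrals.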

B. Many-electron formulation 

During the last few years there has been a resurgence of interest, by both
physicists and chemists, in formal many-particle theory, particularly the
question of how electron correlation may best be represented in the wave
function. One promising approach, currently in active development, is the
natural spin-orbital expansion. Another scheme, the MO-IS method, is described
in the next part of the paper. Many sophisticated techniques have been
reported, numerous excellent reviews of this subject now exist, and any further
detailed analysis is inappropriate to our purposes. All of the work reported
here has been carried out by the well-known SCF MO and VB methods.
Wave functions at this level of approximation (particularly the Hartree-Fock
MO) are generally treated as a zeroth order starting solution in contemporary
many-particle theory. One of the most important new results that is coming
out of our laboratory at Princeton and from several other centers is the full
realization that this zeroth order starting point is very good in itself, encompassing
by far the largest share of all chemistry and solid-state physics. One
possible corollary is that we may obtain sufficient chemical, physical, and 
numerical information from these well-established schemes that it will become 
unnecessary to push approximations appreciably further. 

The digital computer program for our Roothaan MO-SCF scheme is 
similar to those which have been written at MIT, Chicago, and elsewhere. 
At the time of writing, this program is limited to closed-shell cases but will 
be able to handle odd-electron ground-state molecules very soon and excited
states somewhat later. A general capability for carrying out valence bond
calculations has not existed heretofore. The method we have employed is
formulated in terms of a nonorthogonal basis via Löwdin’s overlap determinant
prescription. Because reasonably accurate VB wave functions require at
least several configurations, our digital computer program for this method is
more limited as to size of system than the MO, but it is capable of treating
odd-electron species.
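
The central element of the overlap determinant prescription is that the overlap of two normalized Slater determinants built from nonorthogonal spin orbitals equals the determinant of the matrix of spin-orbital overlaps. The sketch below illustrates only that one statement; it is a minimal pure-Python illustration with hypothetical function names and a made-up overlap value, not a reproduction of the valence bond program described above.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0.0  # singular to working precision
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def determinant_overlap(orbital_overlaps):
    """Overlap <A|B> of two determinants from S[i][j] = <a_i|b_j>."""
    return determinant(orbital_overlaps)


# Two same-spin orbitals with mutual overlap s (illustrative value):
# the determinant built from them has squared norm 1 - s**2.
s = 0.6
norm = determinant_overlap([[1.0, s], [s, 1.0]])
print(norm)
```

Hamiltonian matrix elements between nonorthogonal determinants require cofactors of the same overlap matrix, which is one reason the cost grows quickly with the number of configurations.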

1. Molecular Orbital Results. The principal objective of the work reported here
is to give examples of ab initio polyatomic molecules, but to be reliable and
quantitative in the representation of chemical properties, the wave functions
must be close to true molecular Hartree-Fock solutions. This is clearly indicated
by results using a single Gaussian per orbital (6,7). We have found
similar results employing a single Gaussian per lobe. It is quite easy to produce
an ab initio solution with single Gaussians for a sensationally large molecule
but, for the most part, these wave functions are even qualitatively worthless.
When compared with molecular Hartree-Fock solutions it is also apparent
that less drastic, but nevertheless chemically significant, errors are introduced
through use of a single exponential function per orbital (8). Fortunately, we
have available a fundamental set of reference wave functions from which we
can make controlled and continuous excursions to slightly less accurate solutions,
always keeping track of our source of errors in chemically important
expectation values. It is essential to explore how far one can go from the
reference solutions and still produce quantitatively significant results because
small simplifications can make a large difference in the size of system one is
able to treat. These reference wave functions are the true molecular Hartree-Fock
solutions obtained for diatomic molecules at the University of Chicago
under the direction of C. C. J. Roothaan (9). Table 1 gives total energy comparisons
for six representative diatomic molecules. Our solutions employ only
s and p basis orbitals while the University of Chicago solutions have small d
and f contributions in C2, N2, and F2. In all six molecules the effect of a coordinate
scale factor was explored, and it is only for hydrogen that an appreciable
effect is observed although B in the BH wave function also contracts
somewhat. The last three wave functions allow the linear expansion coefficient
for the outer group of Gaussian lobe functions to be energy determined. This
permits the tails of the 2s and 2p to move in or out relative to a fixed Hartree-Fock
AO. Freedom of this sort has been introduced into about half the calculations
reported here and, aside from the special contraction of hydrogen,
appears to be the primary modification of the strict LCAO worth considering.
For example, this modification can frequently change prediction of internuclear
separations from 10% too large to ±2% error, but does not change
most conclusions in comparing one molecule with another.

TABLE 1

Comparison with Molecular Hartree-Fock Wave Functions^a

                    Atoms                       Molecule             Difference in total
Species    Princeton^b    Chicago       Princeton^b    Chicago       molecular energies, eV

LiH        Li, 7.43119    7.43273       7.972852       7.98687       0.38 (0.17%)
           H,  0.5000     0.5000
BH         24.52409       24.52905      25.10007       25.13136      0.85 (0.12%)
Li2        7.43119        7.43273       14.86501       14.87152      0.18 (0.044%)
C2         37.68052       37.68861      75.35003       75.40620      1.52 (0.74%)
N2         54.38815       54.4004       108.91896      108.9922      2.00 (0.067%)
F2         99.38232       99.40928      198.69293      198.76825     2.04 (0.38%)

^a Atomic and molecular total energies are given in Hartree units. All energies are negative
and all calculations are for the observed internuclear separation.

^b R. J. Buenker, J. L. Whitten, and L. C. Allen (submitted to J. Chem. Phys.).

It has been observed by other workers that the hydrogen atom, almost unique among
atoms, contracts appreciably when it enters into chemical combination. We 
now have widespread confirmation of this effect. Relative to the exact free 
atom orbital, $e^{-r}$, an in situ contraction from $e^{-1.2r}$ to $e^{-1.5r}$ is to be expected:
the specific value can be related to the electronegativity of the atom attached 
to the hydrogen. It is to be noted that these two modifications in the rigid LC 
(Hartree-Fock) AO basis set are very much in the spirit of our approach because 
they are simple well-defined perturbations which do not require an elaborate 
nonlinear parameter search, and retain the simple chemical interpretability 
of the results. 
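
The last column of Table 1 follows from the two molecular columns by a unit conversion, and the arithmetic is easily repeated. The sketch below assumes a present-day Hartree-to-eV factor (the value used in the original work is not stated) and uses hypothetical names throughout; it is offered as a consistency check on the tabulated energies, not as part of the reported calculations.

```python
HARTREE_TO_EV = 27.2114  # assumed modern conversion factor

# (Princeton, Chicago) total molecular energies from Table 1, magnitudes in Hartree
TABLE_1 = {
    "LiH": (7.972852, 7.98687),
    "BH": (25.10007, 25.13136),
    "Li2": (14.86501, 14.87152),
    "C2": (75.35003, 75.40620),
    "N2": (108.91896, 108.9922),
    "F2": (198.69293, 198.76825),
}

# Last column of Table 1 as printed, in eV
PRINTED_EV = {"LiH": 0.38, "BH": 0.85, "Li2": 0.18, "C2": 1.52, "N2": 2.00, "F2": 2.04}


def energy_gap_ev(princeton, chicago):
    """Energy by which the Chicago solution lies below the Princeton one, in eV."""
    return (chicago - princeton) * HARTREE_TO_EV


gaps = {name: energy_gap_ev(*pair) for name, pair in TABLE_1.items()}
for name, gap in gaps.items():
    print(f"{name:4s} computed {gap:5.2f} eV   printed {PRINTED_EV[name]:5.2f} eV")
```

Each recomputed difference agrees with the printed eV entry to within about 0.01 eV under the assumed conversion factor.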

2. Valence Bond Results. The most noticeable feature of the valence bond 
results is that a moderate number of configurations yields energies a bit lower 
than molecular Hartree-Fock solutions and, of course, greatly improves the 
shape of potential curves at larger internuclear separations. Figure 1 shows 
F2 using a rigid LC(Hartree-Fock)AO basis and eight configurations. Four
configurations correspond to F F and four to F⁺ F⁻, representing all possible
occupancy combinations for the 2s and 2p orbitals keeping the 1s orbitals
always doubly occupied. The total energy at the predicted equilibrium separation
(12±% too large) is -198.7780 Hartree units and the predicted binding
energy is 0.35 eV (experimental = 1.37 eV). The valence bond solution was
obtained for F2 because it alone among the simple diatomics yields a negative
binding energy (-1.37 eV) for the molecular Hartree-Fock wave function.
Das and Wahl (10) have recently carried out an optimized orbital MO-configuration
interaction treatment and obtained results comparable to ours:
$E_t = -198.8378$ Hartree units at an $R_{equi}$ 9% too large, BE = 0.54 eV
(experimental = 1.37).

Fig. 1. Potential curve for F2 constructed from a valence bond wave function using
an LC (Hartree-Fock) AO basis.
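
The negative Hartree-Fock binding energy of F2 can be recovered directly from the Table 1 entries. The sketch below is an illustrative check under stated assumptions: the conversion factor is a present-day value, the function name is hypothetical, and pairing the Princeton atomic energy with the valence bond total energy is an assumption about how the quoted binding energy was formed.

```python
HARTREE_TO_EV = 27.2114  # assumed modern conversion factor


def binding_energy_ev(atom_energy, molecule_energy):
    """Binding energy of a homonuclear diatomic in eV, positive when bound.

    Both arguments are signed total energies in Hartree.
    """
    return (2.0 * atom_energy - molecule_energy) * HARTREE_TO_EV


# Chicago Hartree-Fock atom and molecule from Table 1
hf_f2 = binding_energy_ev(-99.40928, -198.76825)

# Princeton atomic energy with the valence bond total quoted in the text (assumed pairing)
vb_f2 = binding_energy_ev(-99.38232, -198.7780)

print(round(hf_f2, 2), round(vb_f2, 2))
```

The Hartree-Fock value comes out negative, in line with the -1.37 eV quoted above, while the valence bond total energy lies below twice the atomic energy and so gives a small positive binding energy.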

In Table 2 we give a series of comparative results for the well-known 
reference molecule hydrogen fluoride. We are just beginning to compute 
properties for our wave functions and some of those for the valence bond HF 
solution are compared with the molecular Hartree-Fock wave function in 
Table 3. The fluorine function has not been completely optimized in its 
molecular environment, and our experience indicates that for valence bond 
solutions this accounts for the relatively poor dipole moment result. For 
the other properties there is no strong reason to believe that values obtained 
from one function are better than those from the other. For LC (Hartree-Fock) 
AO solutions several of the polyatomic molecules (e.g., H2O) give dipole
moments 30% too large. Although allowing tails to vary and hydrogens to
contract improves results, the addition of d functions is almost certain to be
required in some cases. 
TABLE 2

Wave Functions for Hydrogen Fluoride

                                                Total energy       Dipole moment    Energy comparison with
Method          Basis set                       (Hartree units)    (Debyes)         molecular Hartree-Fock, eV

SCF MO          Single exponential AO^a         -99.53614          1.44             14.5 (0.53%)
SCF MO          Hartree-Fock AO^b               -99.96339          1.99             2.9 (0.11%)
SCF MO          One center^c                    -100.00529         2.10             1.8 (0.07%)
SCF MO          Gaussian, nuclear center^d      -100.01785         2.35             1.4 (0.05%)
SCF MO          Gaussian, lobe^e                -100.02238         1.98             1.3 (0.04%)
SCF MO          Molecular Hartree-Fock^f        -100.07030         1.945
SCF MO CI       Hartree-Fock AO^b               -99.98352          1.835            2.36 (0.09%)
Valence bond    Gaussian, lobe^g                -100.10434         2.00             0.925 lower
Experimental                                    -100.4393          1.82

^a B. J. Ransil, Rev. Mod. Phys. 32, 245 (1960).
^b A. M. Karo and L. C. Allen, J. Chem. Phys. 31, 968 (1959).
^c R. Moccia, J. Chem. Phys. 40, 2164 (1964).
^d M. C. Harrison, SSMTG, Quart. Prog. Rep. No. 49, July 15, 1963, MIT.
^e J. L. Whitten and L. C. Allen (to be submitted to J. Chem. Phys.).
^f P. E. Cade and W. Huo (to be submitted to J. Chem. Phys.).
^g R. M. Erdahl, J. F. Harrison, and L. C. Allen (to be submitted to J. Chem. Phys.).

TABLE 3

Predicted Expectation Values for Hydrogen Fluoride^a

                                              Quadrupole
                                              moment                            Field       Field
                    Total         Dipole      relative                          gradient    gradient
Function            energy        moment^b    to F         <1/r_F>    <1/r_H>   at F        at H

Valence bond        -100.10434    0.790       1.777        2.7743     0.6241    3.3001      0.4922
  (Princeton)
Molecular           -100.07030    0.765       1.884        2.7169     0.6112    2.8687      0.5398
  Hartree-Fock
  (Chicago)

^a All values in atomic units. Valence bond results from the Ph.D. thesis of J. F. Harrison
(to be submitted to J. Chem. Phys.).

^b Experimental value = 0.716 atomic units = 1.827 Debye.
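
The dipole moments of Table 3 (atomic units) and Table 2 (Debyes) are related by a fixed factor. The sketch below converts the two computed values; the conversion constant is a present-day value assumed here, and the function name is hypothetical.

```python
AU_TO_DEBYE = 2.541746  # one atomic unit of dipole moment (e * bohr) in Debye, assumed value


def dipole_in_debye(dipole_au):
    """Convert a dipole moment from atomic units to Debye."""
    return dipole_au * AU_TO_DEBYE


valence_bond = dipole_in_debye(0.790)   # Table 3, valence bond (Princeton)
hartree_fock = dipole_in_debye(0.765)   # Table 3, molecular Hartree-Fock (Chicago)

print(round(valence_bond, 2), round(hartree_fock, 3))
```

Both converted values agree with the corresponding entries of Table 2 (2.00 and 1.945 Debyes) to within about 0.01 Debye.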
C. Correlation energy correction

Because molecular Hartree-Fock solutions are in general quite successful 
in predicting expectation values for one-electron operators at the equilibrium 
configuration, it is worthwhile to seek an empirical correlation correction for
binding energies, the principal shortcoming of the Hartree-Fock solution.
This correction will be a single number rather than a complex functional
dependence in a wave function. If possible, it would also be most useful if this
molecular number could be obtained in terms of the atomic (or ionic) constituents.
The basic idea for such a correlation comes from the long-standing
realization that the standard Hartree-Fock approximation is a pair-preserving
theory, thus separating into ionic states at large internuclear distances. It has
also been known from the early days of molecular quantum mechanics that
the major part of the correlation energy occurs for electrons of opposite
spin occupying the same orbital. It should then be true to first order that one
could obtain an $R_{equi}$ binding energy correction by taking the difference in
the correlation energy between Hartree-Fock solutions for neutral free atoms
and the ions into which the molecule separates and adding this to the binding
energy computed from a molecular Hartree-Fock solution. Nesbet (11)
appears to be the first to have specifically stated this prescription and applied 
it to a molecule. Clementi (12) and others have also made use of it. 
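
The prescription can be written compactly. The notation in the sketch below is assumed for illustration and is not taken from the original: $E^{HF}(X)$ is the Hartree-Fock total energy of species X, $E_c(X) > 0$ is the magnitude of its correlation energy, so that the true total energy is $E^{HF}(X) - E_c(X)$, and $A^+$ and $B^-$ stand for the ions into which the Hartree-Fock wave function of AB separates.

```latex
\begin{align}
D_e &= \bigl[E^{HF}(A) + E^{HF}(B) - E^{HF}(AB)\bigr]
      + \bigl[E_c(AB) - E_c(A) - E_c(B)\bigr] \\
    &\approx D_e^{HF} + \bigl[E_c(A^+) + E_c(B^-)\bigr] - \bigl[E_c(A) + E_c(B)\bigr]
\end{align}
```

The first line is exact within the stated definitions. The second replaces the molecular correlation energy by that of the separated ions, which is the first-order assumption described above: the molecule and the ions contain the same number of doubly occupied orbitals, whereas the neutral atoms do not.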

Fig. 2. Correlation corrected dissociation energies compared to experimental values
for the hydrides Li to F.

Fig. 3. Correlation corrected dissociation energies compared to experimental values
for the hydrides Na to Cl.

We (13) have systematically employed this scheme for all of the University
of Chicago diatomic Hartree-Fock reference wave functions. The results are
very encouraging: Dissociation energies are predicted to within +£ eV— 
essentially as accurate as the experimental values themselves. Figures 2 and 
3 illustrate the results for second and third row hydrides. Similar graphs have 
been obtained for homonuclear species, 12-electron, and 14-electron heteronuclear
diatomics. We have also gone beyond this simple method in two ways.
First, we have used the diatomic derived results as a per-bond basis and simply 
added the correlation energy corrections for each bond for a number of our 
polyatomic wave functions. In most polyatomic molecules, however, we do 
not possess a precise Hartree-Fock solution, and thus we must first correct 
to the Hartree-Fock level and then apply the correlation energy correction 
on top of this. For example, for 10-electron polyhydrides we have calculated a 
wave function close to the molecular Hartree-Fock solution and at the same 
accuracy level for each molecule. Assuming a constant isoelectronic discrepancy,
the difference was calibrated to HF and added to the correlation correction. 
Figure 4 shows typical results—again, an order of error not different from 
experimental values. Second, we have devised a set of rules for the correlation 
energy in isolated ions and these yield smoother curves with slightly greater 
average accuracy over a range of molecules and they also provide a basis for 
extrapolation beyond existing computations. 

There is one corollary to this correlation correction scheme which is worth 
noting because of the accuracy we have achieved: The atomic (ionic) based 
nature of the corrections implies that the correlation energy is independent 
of bond angle. Our present results are too crude to use this fact in seeking the
origin of rotational barriers, but it does indicate that molecular-shape
prediction is within the capability of approximate molecular Hartree-Fock
solutions.

Fig. 4. Adjustment to the molecular Hartree-Fock level plus correlation correction for
some ten-electron polyhydrides.

D. Examples from chemistry and physics 

1. The Chemical Bond at Large Internuclear Separation: The Hydrogen
Bond (14). The perturbed-atom nature of the valence bond formulation and its
correct free-atom separation at large internuclear distances make this the
proper choice of representation for interactions such as the hydrogen bond.
The particular system that we have investigated is the strongest known hydrogen
bond, the bifluoride ion, [FHF]⁻. In this and all of the other examples
from chemistry and physics there is an enormous wealth of details and data
available, a great deal more in fact than one is used to from experience with
diatomic solutions. This large quantity of important data can only be adequately
dealt with in the full-length journal articles currently being prepared.
Here we can only give a broad survey, primarily relying on selected graphs,
which is bound to appear superficial to the deeply interested reader. Underlying
all of our work is a constant search for answers to a number of fundamental
technical questions such as the following: Have we explored variation in the
quality of the basis set sufficiently so that we are not missing fundamental
chemical effects? Have we carried out enough check calculations and repeated
enough existing work to know that there are no errors in our elaborate
digital computer routines? For example, in the valence bond program we have
reproduced the often checked HF calculations of Kastler and the LiH wave
function of Karo. For [FHF]⁻ an elaborate set of calculations was
carried out to obtain a fluorine solution that would simultaneously optimize
the HF, F, and F⁻ systems. The H atom was carefully scaled in situ.

Our wave function for the bifluoride ion embodies a complete valence
bond configuration interaction, including all neutral and ionic states except
F 1s excitations. Figure 5 shows symmetric stretch of the F-F distance and
demonstrates that our calculation predicts nearly the correct experimental
distance of 4.25 au. Figure 6, asymmetric linear stretch at the equilibrium
F-F distance, shows two stages in the development of an improved wave
function. The reasonable approximation to the experimental force constant
obtained from our wave function can be inferred from the figure. Figure 7 displays
the asymmetric stretch for several internuclear F-F distances, and for
the same two levels of approximate solutions illustrated in Fig. 6. A fundamental
objective of our study was to demonstrate the single minimum nature
of the potential curve around the F-F equilibrium separation which has long
been postulated for this ion. Also of considerable interest for understanding
hydrogen bonding in general is the point at which a double minimum potential
first appears. We see that this occurs at an F-F separation of approximately
5 au, a result not available from instrumental measurement. Figure 8
gives further details of the complex nature of the potential surface for this
system.

Fig. 5. Energy variation with linear, symmetric F-F displacement in the bifluoride ion.

Fig. 6. Potential energy surface for motion of hydrogen in [FHF]⁻ at equilibrium F-F
positions.
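
The distinction between a single and a double minimum can be read off a sampled potential curve by counting interior local minima. The sketch below shows that bookkeeping on a toy model only: the quartic form, its coefficients, and all names are hypothetical and are not fitted to the computed [FHF]⁻ surface. The one number borrowed from the text is the F-F separation of about 5 au at which the double minimum is said to appear.

```python
def count_minima(energies):
    """Count interior local minima of a sampled one-dimensional potential curve."""
    return sum(
        1
        for i in range(1, len(energies) - 1)
        if energies[i] < energies[i - 1] and energies[i] < energies[i + 1]
    )


def model_proton_potential(x, ff_distance, critical_distance=5.0):
    """Toy potential V(x) = x^4 + k x^2 for proton displacement x from the midpoint.

    k is positive (single well) below critical_distance and negative
    (double well) above it. Purely illustrative.
    """
    k = critical_distance - ff_distance
    return x ** 4 + k * x ** 2


def minima_for(ff_distance, half_width=2.0, n_points=401):
    """Sample the toy potential on a uniform grid and count its minima."""
    xs = [-half_width + 2.0 * half_width * i / (n_points - 1) for i in range(n_points)]
    return count_minima([model_proton_potential(x, ff_distance) for x in xs])


print(minima_for(4.25), minima_for(5.5))
```

Applied to a computed curve instead of the toy function, the same count distinguishes the symmetric single-well hydrogen bond from the double-well case found at larger F-F separations.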

Another detailed study of this sort, whose existence we mention here, is the 
interaction between two HF molecules. These calculations were undertaken to 
understand dimerization and hydrogen bonding as a function of H-F distances 
and as a function of angle and distance between the two molecules. A novel 
feature of this investigation is separate treatment of the problem with and 
without ionic state mixing between the two molecules, thus giving an estimate 
of the magnitude of van der Waals forces compared to the chemical bond 
forces. 

Fig. 7. Potential energy surface for motion of hydrogen in [FHF]⁻ along F-F axis.
Abscissa: X_H (atomic units).

2. The Chemical Bond at Large Internuclear Separation: Noble Gas Compounds
(15). Here again a valence bond wave function is appropriate for two
reasons: First, we are particularly interested to know whether certain species
are bound or unbound and a Hartree-Fock solution can definitely lead to ambiguities
because of its correlation and ionic separation errors. Second, noble
gas atoms have all orbitals doubly occupied to a first approximation and high-lying
first excited states, thus leading to repulsive curves for the zero
order VB solution or separating into ions for the MO solution rather than
the zero activation energy neutral atom state observed experimentally. In
order to make a continuous connection between the free noble gas atoms and
the possibility of molecule formation the best representation is a VB configuration
interaction treatment. VB wave functions have been constructed for a
number of possible noble gas compounds. Intensive efforts have been made 
and are continuing for the experimental synthesis of fluorides of the lower rare 
gas atoms. Also numerous qualitative and semi-quantitative predictions for the 
existence of He and Ne containing molecules have been made (16-18). Thus it 
has been especially important to obtain rigorous ab initio wave functions for a 
number of these species. For HeF2 chemical structures included were: F He
F, F⁻ He⁺ F + F He⁺ F⁻, F⁻ He F⁺ + F⁺ He F⁻, and F⁻ He²⁺ F⁻. In
general, each of these structures corresponds to many states differing in orbital
occupancy, and the states in turn are composed of sums of twenty-row determinantal
functions with symmetry-determined coefficients. Symmetric
arrangements of atoms on a line lead to 11 states and 33 determinants,
asymmetric linear arrays to 18 states and 33 determinants, and off-line
equal bond length arrangements to 18 states with 53 determinants. These
states represent a complete configuration interaction calculation with a
ground-state atomic orbital basis (except for excitations of the fluorine 1s
electrons).

Fig. 8. Potential energy surface for motion of hydrogen in [FHF]⁻ perpendicular
and parallel to the F-F axis.
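
The order of the determinants is simply the number of electrons in the system, one row per electron. A short sketch of that count follows; the dictionary and function names are hypothetical conveniences for the example.

```python
ATOMIC_NUMBER = {"H": 1, "He": 2, "O": 8, "F": 9, "Ne": 10}


def electron_count(atoms, charge=0):
    """Number of electrons, hence the order of each Slater determinant."""
    return sum(ATOMIC_NUMBER[atom] for atom in atoms) - charge


hef2 = electron_count(["F", "He", "F"])            # HeF2
fhf_anion = electron_count(["F", "H", "F"], -1)    # bifluoride ion
neo = electron_count(["Ne", "O"])                  # NeO

print(hef2, fhf_anion, neo)
```

HeF2 and the bifluoride ion both come out at twenty electrons, so their determinants have the same order.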

Molecular potential energy curves were obtained for three geometrical 
types: linear symmetric, linear asymmetric, and bent configurations with the 
He atom midway between the fluorine atoms. Representative curves are 
shown in Figs. 9-11. The chemically most significant states, with their approximate
weights, for the linear symmetric wave function at a separation near
that expected if the molecule were stable (F-F distance = 4.25 au = 2.25 Å)
are given below along with isoelectronic [FHF]⁻ at its equilibrium separation
(also 4.25 au):

FHeF

Ψ ≈ + 0.621 {F He F} + 0.432 {F He⁺ F⁻}
    + 0.070 {F He F} + 0.150 {F⁺ He F⁻}

[FHF]⁻

Ψ ≈ + 0.118 {F H⁻ F} + 0.464 {F⁻ H⁺ F⁻}
    + 0.471 {F H F⁻} - 0.142 {F Ft F⁻}.



Fig. 9. Potential energy surface for linear symmetric HeF2.


One of the strongest reasons for belief in the HeF2 repulsive potential
energy surface is our complete potential surface for [FHF]⁻ which agrees
closely with experiment. The most important terms in the equilibrium
position [FHF]⁻ wave function are displayed directly below the HeF2
valence bond expansion to aid qualitative understanding of the difference
between these two systems. It is basically the ability to form an ordinary 
electron-pair bond between singly occupied orbitals on adjacent atoms that is 
required for binding. 
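
Internuclear distances in this section are quoted in atomic units with the Ångström equivalent in parentheses. The conversion can be checked with the short sketch below, which assumes a present-day value of the Bohr radius and a hypothetical function name.

```python
BOHR_TO_ANGSTROM = 0.529177  # assumed present-day value of the Bohr radius in Å


def bohr_to_angstrom(distance_au):
    """Convert a length from atomic units (bohr) to Ångström."""
    return distance_au * BOHR_TO_ANGSTROM


ff_distance = bohr_to_angstrom(4.25)   # F-F separation used for HeF2 and the bifluoride ion
table4_distance = bohr_to_angstrom(2.0)  # separation at which Table 4 is evaluated

print(round(ff_distance, 2), round(table4_distance, 2))
```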

Fig. 10. Potential energy surfaces for linear asymmetric HeF2.

In addition to the ionic states discussed above we have also carried out
calculations including further configuration interaction for HeF2:

(a) In-out or split orbital, (1s)(1s'), flexibility was introduced into the He
orbital. Results for a nonoptimized version of this are shown by the lower
curve in Fig. 9. The total atomic energy was lowered 0.43 eV and the total
molecular energy 0.72 eV by this process.

(b) A He 2pσ function was introduced. For linear HeF2 at R = 4.25,
r = R/2, this lowers the total molecular energy by a very small value, 0.125
eV. The effect on the energy of adding these two types of terms to our basic
valence bond wave function was shown to be insignificant, and they certainly 
produce no new qualitative insight into the repulsive forces. However, the 
great variety of speculations as to possible binding mechanisms made it 
imperative that these effects be quantitatively evaluated. 

Fig. 11. Potential energy versus angle for HeF₂ (constant, equal bond lengths).

A complete configuration interaction over the occupied atomic orbitals
was also carried out for HeO, HeF, NeO, NeF, and NeF₂. Thus the NeO
wave function includes the chemical structures NeO, Ne⁺O⁻, Ne²⁺O²⁻, and
is composed of 16 states made up from twenty-four 18 × 18 determinants.
As shown in Fig. 12, the potential energy curves for all species are repulsive
(for NeF₂ and HeF₂ the linear symmetric molecule is plotted). The curves
are all singlets with the oxygen atoms going to a ¹D configuration at infinite
separation. Triplet states for the oxides also were calculated. For these the free
oxygen atom is in its ground state ³P configuration (calculated to be 0.0805
atomic units lower than the ¹D), but the molecular potential energy curves lie
even higher than the corresponding fluorides. Table 4 displays the chemically


TABLE 4

Wave Functions at 2.0 Atomic Units (1.06 Å)ᵃ

HeO   Ψ ≅ + 0.714 HeO(2s)²(2pπ)⁴ - 0.072 HeO(2s)(2pπ)⁴(2pσ) - 0.071 HeO(2p)⁶
          - 0.051 HeO(2s)²(2pσ)²(2pπ)² + 0.377 He⁺O⁻(2s)²(2pσ)(2pπ)⁴

NeO   Ψ ≅ + 0.600 NeO(2s)²(2pπ)⁴ - 0.157 NeO(2s)(2pσ)(2pπ)⁴ - 0.037 NeO(2p)⁶
          - 0.031 NeO(2s)²(2pσ)²(2pπ)² - 0.364 Ne⁺(2s)²(2pσ)(2pπ)⁴ O⁻(2s)²(2pσ)(2pπ)⁴
          - 0.107 Ne⁺(2s)(2p)⁶ O⁻(2s)²(2pσ)(2pπ)⁴ + 0.043 Ne⁺(2s)²(2pσ)(2pπ)⁴ O⁻(2s)(2p)⁶

HeF   Ψ ≅ + 0.848 HeF(2s)²(2pσ)(2pπ)⁴
          + 0.282 He⁺F⁻

NeF   Ψ ≅ + 0.770 NeF(2s)²(2pσ)(2pπ)⁴
          - 0.360 Ne⁺(2s)²(2pσ)(2pπ)⁴ F⁻ + 0.104 Ne⁺(2s)(2p)⁶ F⁻

ᵃ Contained in the Ph.D. thesis of A. M. Lesk.

Fig. 12. Potential energy curves for HeO, HeF, HeF₂, NeO, NeF, and NeF₂.

significant terms in the wave functions for the various species. The fact that
the neon-associated species lie above those with helium can be attributed
simply to the relative size of these atoms, although it is difficult to assign an
effective radius to helium or a bond length to HeO or HeF (the leading term
in the wave functions effectively represents two neutral atoms repelling one
another, and its coefficient measures the extent to which they have achieved
a free atom-like behavior). As would be expected, the triatomic bifluorides lie
below the diatomic fluorides because of their extra symmetry element. The
particularly interesting result showing the oxides to lie lower in energy than
the fluorides arises from three factors which together override the greater
electronegativity of fluorine. In the neutral states, 2s-2p hybridization (favored
in oxygen over fluorine because of the smaller orbital energy separation),
together with the existence of half-filled π orbitals in the oxygen-containing
species, permits an energy-lowering charge redistribution. Hybridization in
these states allows charge to move away from the between-atom repulsive
region. Among the singly ionized states there are three in NeO and one in
HeO which significantly lower the energy through formation of electron-pair
bonds between open shells on each atom. Hybridization in the ionic states
has the opposite sense to that in the neutral states, favoring the pair bonds by
building up the charge between atoms. Purely ionic contributions are almost
identical for both the fluorides and oxides, and the doubly ionized states,
possible for the oxides but not the fluorides, make such an insignificant
contribution that they have been omitted from the approximate wave function
tabulation in Table 4. In general, binding is discouraged because in the
dominant states one of the atoms always has a closed-shell configuration. Since
HeO exhibited the least repulsion, an even more elaborate configuration
interaction including different orbitals for different spins on helium was
carried out, with the result shown by the dotted curve in Fig. 12.
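The quantities in this subsection are quoted in a mixture of atomic units, electron volts, and angstroms. The following minimal Python sketch converts between them. The conversion factors are standard rounded values and are an assumption of this sketch (they are not quoted in the text), and the function names are purely illustrative.

```python
# Standard conversion factors, rounded; assumed here, not taken from the text.
HARTREE_TO_EV = 27.211386      # electron volts per atomic unit of energy
BOHR_TO_ANGSTROM = 0.529177    # angstroms per atomic unit of length


def hartree_to_ev(energy_hartree):
    """Convert an energy from atomic units (hartree) to electron volts."""
    return energy_hartree * HARTREE_TO_EV


def bohr_to_angstrom(length_bohr):
    """Convert a length from atomic units (bohr) to angstroms."""
    return length_bohr * BOHR_TO_ANGSTROM


if __name__ == "__main__":
    # Internuclear separation used for the wave functions of Table 4.
    print(f"2.0 a.u. of length = {bohr_to_angstrom(2.0):.2f} angstrom")
    # Calculated 3P-1D separation of the free oxygen atom.
    print(f"0.0805 a.u. of energy = {hartree_to_ev(0.0805):.2f} eV")
```

With these factors, 2.0 atomic units of length comes out at about 1.06 Å, in agreement with the heading of Table 4, and the calculated ³P-¹D separation of 0.0805 atomic units corresponds to about 2.19 eV.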

3. The Ten-Electron Polyhydrides (19). We have employed MO SCF wave
functions to predict the equilibrium geometry and properties of the sequence:



CH^, and BHJ. For all of these which have known structures we obtain 
agreement with experiment on bond angles and bond distances to ±2%.
There are obviously a great many other interesting properties, such as dipole
and quadrupole moments, proton affinities, etc., but we have selected two points
to illustrate the type of chemical information obtained by virtue of carrying
out calculations for sequences of molecules for a large number of geometrical
configurations. It is of interest to organic chemists concerned with
conformational analysis to know the following trend in the HAH angle:


CH₃⁻        NH₃        H₃O⁺




The fact that H₃O⁺ would be planar in the gas phase (with a very shallow
potential curve) is not apparent from crystallographic studies, where it always
appears bent. The trend in angles is matched by decreasing s character of the
s-p hybrid on the central atom as one goes to the right, thereby justifying the
qualitative chemical rule relating bond angle to s character on the central
atom. In Fig. 13 we show two potential curves for angle bending in the gaseous
water molecule. What is interesting here is the coupling between bond length
and angle, which shows that one gets a quite wrong estimate of the energy
required to change the angle in water from 104.5° to the tetrahedral
configuration if one considers the angular variation of potential energy with
fixed equilibrium bond length. We estimate that the energy required for bending
from 104° to 109° is of the order of 1 kcal or less, and this can be important
for theories of the structure of water and ice.

Fig. 13. Potential energy versus angle in the water molecule.
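The coupling between bond length and angle noted above can be illustrated with a simple harmonic model, E = ½k_r Δr² + ½k_θ Δθ² + k_c Δr Δθ. This is only a sketch: the quadratic form, the force constants, and the function names are all assumptions made for illustration and are not the calculated water surface. It shows why a bend at fixed bond length always costs at least as much energy as a bend in which the bond length is allowed to relax.

```python
import math


def rigid_bend_energy(k_theta, dtheta):
    """Energy of an angle change dtheta with the bond length held fixed."""
    return 0.5 * k_theta * dtheta ** 2


def relaxed_bend_energy(k_r, k_theta, k_coupling, dtheta):
    """Energy of the same angle change when the bond length relaxes.

    Minimizing the model energy over dr gives dr_opt = -k_coupling*dtheta/k_r,
    which lowers the energy to 0.5*(k_theta - k_coupling**2/k_r)*dtheta**2.
    """
    dr_opt = -k_coupling * dtheta / k_r
    energy = (0.5 * k_r * dr_opt ** 2
              + 0.5 * k_theta * dtheta ** 2
              + k_coupling * dr_opt * dtheta)
    return energy, dr_opt


if __name__ == "__main__":
    # Hypothetical force constants in arbitrary consistent units.
    K_R, K_THETA, K_COUPLING = 8.0, 0.7, 0.3
    dtheta = math.radians(109.47 - 104.5)  # opening toward the tetrahedral angle
    e_rigid = rigid_bend_energy(K_THETA, dtheta)
    e_relaxed, dr = relaxed_bend_energy(K_R, K_THETA, K_COUPLING, dtheta)
    print(f"rigid: {e_rigid:.6f}  relaxed: {e_relaxed:.6f}  dr: {dr:.6f}")
```

In this model the size of the overestimate is governed by k_c²/k_r. The real surface is not harmonic, so the sketch indicates only the direction of the effect, not its magnitude.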

4. The Geometry of Molecules (20). A number of years ago Walsh (21) proposed
a set of orbital energy versus angle graphs for AH₂, AH₃, AB₂, AB₃, ABC,
H₂AB, HAAB, etc., systems. These curves, based on spectroscopic evidence
and qualitative reasoning, have proved able to systematize and predict the
shape of a very large number of molecules. In fact, Walsh’s diagrams have become
one of the most celebrated hypotheses of inorganic stereochemistry.

Because the sum of one-electron orbital energies does not equal the total
energy in Hartree-Fock theory it has been felt (22) that the dependent variable
on these graphs could not be the one-electron eigenvalue of the Hartree-Fock
equations. However, we now have accurate data for a large number of 6, 8,
10, 12, 14, and 16 electron molecules of types AH₂, AB₂, AH₃, AB₃, ABC,
and we find that in fact the one-electron energy is the appropriate
mathematical and chemical dependent variable. For all of the various types of
systems our one-electron energy curves show the same general form as those
of Walsh, while this is not true for that one-electron energy quantity, e_i, in
Hartree-Fock theory which adds up to the total energy:

$$E_T = \sum_i \tfrac{1}{2}\left[\varepsilon_i + (i|f|i)\right] = \sum_i e_i$$

Figure 14 shows the original Walsh diagram for an AH₂ molecule
superimposed on our calculated curve for BH₂. (The apparent discrepancies are
primarily caused by Walsh’s incorrect omission of 2s orbital contributions on
the central atom at 90°. This error was pointed out some time ago by Mulliken
and, when compensated for, does not change the arguments.) The basic reason
why the sum of one-electron energies may be used to simulate the predictions
of the total energy is that there is inherently a low information content in
the determination of bond angles. This can be shown by analyzing the energy
expressions in terms of the mathematical inequalities:

If either of these holds then the sum of the one-electron energies will give
the same prediction as the total energy. These inequalities are in fact satisfied
for almost all molecular wave functions. Curves of the two quantities are
shown for a typical AH₂ case in Fig. 15. In Fig. 16, we show a curve for a
typical AH₃ system (BH₃). The solid lines are calculations that allow adjustment
in the tails of the atomic orbitals, while the dotted lines are the frozen LC
(Hartree-Fock) AO result. We see that they both give a satisfactory representation.
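The distinction drawn in this subsection, between the sum of orbital eigenvalues ε_i and the quantities e_i = ½[ε_i + (i|f|i)] that add up to the total electronic energy, can be checked on a toy closed-shell two-electron case. The sketch below assumes that (i|f|i) denotes the one-electron (kinetic plus nuclear attraction) integral, and the numerical values of h and J are hypothetical, chosen only to make the bookkeeping visible.

```python
def hf_electronic_energy(orbital_energies, core_integrals):
    """Sum over occupied spin orbitals of e_i = (eps_i + h_ii) / 2."""
    return sum(0.5 * (eps + h) for eps, h in zip(orbital_energies, core_integrals))


def orbital_energy_sum(orbital_energies):
    """Plain sum of the one-electron eigenvalues over occupied spin orbitals."""
    return sum(orbital_energies)


if __name__ == "__main__":
    # Hypothetical values (atomic units) for two electrons in one spatial orbital.
    h = -1.25          # one-electron integral (i|f|i)
    j = 0.50           # Coulomb repulsion between the two electrons
    eps = h + j        # Hartree-Fock eigenvalue of each occupied spin orbital
    e_total = hf_electronic_energy([eps, eps], [h, h])
    e_sum = orbital_energy_sum([eps, eps])
    # The eigenvalue sum counts the electron repulsion twice: e_sum = e_total + j.
    print(e_total, e_sum, e_sum - e_total)
```

Here the electronic energy is 2h + J, while the eigenvalue sum is 2h + 2J. The nuclear repulsion is left out of both quantities in this sketch.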


Fig. 14. Walsh’s diagram for AH₂ species (dotted lines) and one-electron energies
versus angle for BH₂ (solid lines).

5. Electron Deficient Species (23). Molecules in which there are a larger
number of available orbitals than available electrons are, in some sense,
midway between the typical saturated covalent electron-pair bond compound
and a metal. They often play an important role as organic intermediates. One
such species is CH₅⁺. We have found, in contrast to previous speculations and
calculations, that the trigonal bipyramid D3h is not the lowest energy
configuration but rather C4v is lower. There may be another lower symmetry
form with even lower energy and we are continuing our search for this
possibility. The interesting feature here is that both the D3h and C4v forms have
perfectly well-defined minima in their potential energy surfaces and one can
specify well-defined bond lengths. (For D3h: long bond = 1.16 Å, short bond
in plane = 1.13 Å. For C4v: the carbon atom is 0.4 Å out of the plane with
four bonds each of length 1.15 Å; the bond perpendicular to the plane is 1.11
Å.) The total energies of the two forms are very close, and this suggests that
either form could exist in a given circumstance, the particular one being
determined by the surrounding environment. The simplest carbonium ion,
C₃H₃⁺, is also an important organic reaction intermediate and there is no
instrumental means for ascertaining its stability and geometry. Our current
values show a C-C distance of ≈1.50 Å and a remarkably high binding
energy of 29.2 eV (relative to the infinitely separated atoms). This is also the
simplest species in which we can investigate the famous problem of “bent
bonds.” Although incompletely optimized as yet, our present calculations
show charge density maxima outside the triangle, away from the line of
carbon centers.
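The C4v bond lengths quoted above fix the rest of the square-pyramidal geometry by elementary trigonometry. The short sketch below works this out, assuming that the fifth (perpendicular) hydrogen lies on the fourfold axis on the side of the carbon atom away from the basal plane. That placement, and the function name, are assumptions of this illustration.

```python
import math


def square_pyramid_geometry(basal_bond, carbon_height):
    """Derived distances and angle for a C4v AH5 square pyramid.

    basal_bond:    length of each of the four equivalent bonds
    carbon_height: distance of the central atom from the plane of the four H
    """
    # Distance of each basal hydrogen from the fourfold axis.
    radial = math.sqrt(basal_bond ** 2 - carbon_height ** 2)
    # Angle between the axial bond and a basal bond (axial H assumed opposite
    # the basal plane, so the angle exceeds 90 degrees).
    tilt = math.degrees(math.asin(carbon_height / basal_bond))
    apex_basal_angle = 90.0 + tilt
    # Distance between neighbouring basal hydrogens (side of the square).
    adjacent_hh = math.sqrt(2.0) * radial
    return radial, apex_basal_angle, adjacent_hh


if __name__ == "__main__":
    radial, angle, hh = square_pyramid_geometry(basal_bond=1.15, carbon_height=0.4)
    print(f"radial = {radial:.3f} A, apex-basal angle = {angle:.1f} deg, H-H = {hh:.3f} A")
```

With the quoted values (1.15 Å bonds, carbon 0.4 Å out of plane) the basal hydrogens sit about 1.08 Å from the axis and the axial-basal angle is about 110°.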

Fig. 15. Total energy versus angle compared with the sum of one-electron energies
versus angle for a typical AH₂ species (BH₂).

Another electron-deficient species for which we have carried out an
extensive geometry search and analysis is diborane, B₂H₆. We have
simultaneously determined the energy and geometry for BH₃, being careful to
always maintain a perfectly balanced basis set between the two so as not to
prejudice a comparison of the energies between them. Our results show
diborane to be a little over 1 eV lower in total energy than two BH₃ molecules,
thus supporting some recent mass spectrometric and kinetic studies of this
hard-to-measure system. Another aspect of the diborane work has been
calculation of B₂H₆ in the C₂H₆ geometry and vice versa. The results can again be
analyzed in terms of Walsh’s rules.

Fig. 16. One-electron energies versus angle for BH₃: dashed curves, LC (Hartree-Fock)
AO; solid curves, AO’s allowed to adjust to molecular environment.

6. Excited States of Simple Molecules (24). A long-standing challenge to
quantum chemistry has been a priori prediction of the ground and excited
states of CH₂. A very accurate wave function calculated by Foster and Boys
(25) has existed for some time, but this does not give the experimentally
observed results of Herzberg (26). His results indicate a ground-state triplet
approaching linearity rather than the 125° ³B₁ ground state predicted by
Boys’ work. It turns out that this is a rather subtle problem, and we are just
now confident that we almost have the answer in hand. Figure 17 shows our
present result with a minimum between 135° and 140° as the result of a complete
VB configuration interaction with the frozen Hartree-Fock AO’s. Basis
function modification explorations already carried out for this and other
studies clearly indicate that adjustments in the tails of the carbon 2s and 2p
orbitals will yield a somewhat larger angle (perhaps to 160°) and this will be
within the range of experimental uncertainty. The excited ¹A₁ state is correctly
predicted (as it was by Boys) to have an angle of 103°. A number not available
from experiment, but of interest to organic chemists, is the predicted energy
separation between the singlet and triplet (≈1.1 eV).

The valence bond scheme provides a very good method for obtaining the 
valence state excitations of a system and Figs. 18-20 show typical curves for 
some diatomic species that have been measured spectroscopically at the 
National Research Council Laboratories in Ottawa. All experimentally 
accessible levels appear to agree well with our calculations. 

Fig. 17. Potential energy versus angle for first three states of CH₂.

Fig. 18. Potential energy curves for ground and excited states of BH.

7. Rotational Barriers (27). We have been carrying out an extensive set of
calculations on the molecules CH₃CH₃, CH₃OH, and H₂O₂ (we also are
working on singly and doubly fluorinated ethanes) to see if the origin of
rotational barriers may be found within the Hartree-Fock approximation. Sine
qua non to this investigation is a great deal of exploration as to the adequacy of
the basis set and relation of the solution to a true molecular Hartree-Fock
result. Perhaps most of all, a convincing explanation must correctly order and
describe the barriers for a sequence of molecules; calculations on any one
molecule by itself are inadequate. The necessity for obtaining theoretical
numerical results on a sequence of systems is completely analogous to the
tradition in experimental chemistry, where it is well-established practice that
meaningful conceptual understanding of phenomena may be derived only by
examining a given class, series, or sequence of compounds. This is particularly
true for ethane because the high symmetry of the barrier tends to obscure the
detailed mechanism of its origin. Computed results versus angle are shown in
Figs. 21a, b, c for the three molecules noted above. In general, we get the
correct ordering and reasonable magnitudes for the barriers, and analysis of our
H₂O₂ solution gives confidence that further improvements in our wave
function will correct our present values around 180°. Thus we have
accomplished our first objective of proving that the origin of rotational barriers
may be found within the framework of a molecular Hartree-Fock solution. In
addition we have further decomposed the energy into nuclear-electron plus
kinetic energy and electron-electron repulsion components. These two
components are found to be always out of phase with one another, and it is the
detailed balance between these two which is basically the origin of the barriers.
At present we are constructing a simplified model in terms of localized orbital
contributions which will serve to organize and unify our results. Because of
the vast amount of previous work on this problem it is worth noting from our
results that the nuclear-electron and electron-electron components are always
of greater magnitude than the nuclear-nuclear repulsion terms and that the
phase relationship between the nuclear-electron and nuclear-nuclear repulsion
terms changes from molecule to molecule. Lone pair electrons also play an
important role in the H₂O₂ case. It is thus apparent that a screened
nuclear-nuclear repulsion model or a hydrogen-hydrogen only interaction model
cannot elucidate the origin of rotational barriers.
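The statement that the barrier arises from a detailed balance between two large, out-of-phase energy components can be pictured with a toy threefold rotor. In the sketch below each component varies as cos 3φ, the two are exactly 180° out of phase, and the net barrier is the difference of their amplitudes. The functional form, the amplitudes, and the names are hypothetical and are not the computed components.

```python
import math


def component(amplitude, phi_deg, phase_deg=0.0):
    """One energy component of a threefold rotor, peak-to-peak size `amplitude`."""
    return 0.5 * amplitude * math.cos(math.radians(3.0 * phi_deg + phase_deg))


def net_barrier(amp_attractive, amp_repulsive, n_points=361):
    """Peak-to-peak variation of the sum of two components 180 degrees out of phase."""
    values = []
    for i in range(n_points):
        phi = 120.0 * i / (n_points - 1)   # one full period of a threefold rotor
        total = (component(amp_attractive, phi)
                 + component(amp_repulsive, phi, phase_deg=180.0))
        values.append(total)
    return max(values) - min(values)


if __name__ == "__main__":
    # Hypothetical amplitudes in arbitrary energy units.
    print(net_barrier(10.0, 7.0))   # small net barrier from two large components
    print(net_barrier(10.0, 0.0))   # one component alone
```

Two components of sizes 10 and 7 leave a net barrier of only 3, so a modest relative error in either component produces a large relative error in the barrier.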

Fig. 19. Potential energy curves for ground and excited states of NH.

8. Many-Electron Energy Bands for Small Crystals (28). There are certain
problems in solid-state physics that may be approached by the many-electron
techniques built up for molecules. In the future this may well provide an
entirely new viewpoint for the electronic structure of solids. We have
constructed an LC (Hartree-Fock) AO SCF MO wave function for a 32-atom hydrogen
solid for sc, fcc, and bcc lattices. At this stage the surface-to-volume ratio is
not particularly favorable, and it proves technically impossible to construct
Bloch sums and put in periodic boundary conditions, but we are able to
watch the buildup of bands and we are going to be able to go to larger systems.
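What "watching the buildup of bands" means for a finite cluster can be shown with a far simpler model than the SCF calculation described above. The sketch uses a one-electron nearest-neighbour (Hückel-type) chain with open ends, for which the levels are known in closed form, ε_k = α + 2β cos[kπ/(N+1)]. The model, the parameters α and β, and the function names are assumptions of this illustration only.

```python
import math


def chain_levels(n_atoms, alpha=0.0, beta=-1.0):
    """Exact one-electron levels of an open nearest-neighbour chain of n_atoms sites."""
    return sorted(
        alpha + 2.0 * beta * math.cos(k * math.pi / (n_atoms + 1))
        for k in range(1, n_atoms + 1)
    )


def band_width(n_atoms, beta=-1.0):
    """Spread between the highest and lowest level of the finite chain."""
    levels = chain_levels(n_atoms, beta=beta)
    return levels[-1] - levels[0]


if __name__ == "__main__":
    # The width approaches the infinite-chain value 4*|beta| as atoms are added.
    for n in (2, 4, 8, 16, 32):
        print(n, round(band_width(n), 4))
```

The width grows monotonically toward 4|β| as atoms are added but does not reach it for any finite chain. This is the one-dimensional analogue of the finite-size effect mentioned in the text.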

Fig. 20. Potential energy curves for ground and excited states of HF.

9. Inorganic Compounds Composed of First and Second Row Atoms (29). In
addition to the molecules we have discussed, there is a great deal of inorganic
chemistry to be obtained from combinations of the atoms from H to Ne. As
one can imagine, there are many exciting opportunities for understanding the
subtleties of electronic structure in the following species (for which we have
computed high-precision wave functions as a function of geometry): Li₂O,
LiOH, HOF, F₂O, HCN, N₂H₂, N₂H₄, N₂F₂, N₄, NJ, NO₂, NOJ, CO, C₃,
O₃, CO₂, and O₂F₂. Incidentally, the first two molecules violate Walsh’s
rules since these rules would predict a bent species. Experimental molecular
beam results and our calculations show a linear Li₂O molecule. Our
calculations yield a linear LiOH molecule also, and this stands as a
before-the-experiment prediction.

Fig. 21a. Barrier to internal rotation in ethane.

Fig. 21b. Barrier to internal rotation in methyl alcohol.

Fig. 21c. Barrier to internal rotation in H₂O₂.

E. Decomposition analysis for many-electron wave functions

It is obvious from our experience that chemically and physically interesting
systems made from aggregates of atoms three, four, or five times larger than
our examples will be inaccessible to direct a priori calculations because of the
enormous number of three- and four-center integrals. Our first approach to
this problem has been to look for an approximate physical or chemical
relation that would avoid direct computation of these integrals. This approach,
of course, is implicit in all existing model theories. But none of the present
model theories really has an a priori predictive capability, and detailed
examination of them shows that again it is just on the question of three- and
four-center electrostatic interaction integrals that no satisfactory answer is
available. We have also made rather extensive tests of the well-known
Mulliken approximation,

$$\chi_i(1)\,\chi_j(1) = S_{ij}\,\frac{\chi_i^2(1) + \chi_j^2(1)}{2},$$

for three-dimensional molecules where there is no simple separation of π and
σ electrons (30). In summary our conclusions are that, if employed for a single
pair of orbitals, the results are roughly equivalent to those obtained with the
use of single exponential basis orbitals: just a little too erratic to be used with
confidence a priori. Using Mulliken’s approximation for both pairs of
orbitals is generally worthless from an a priori standpoint. However, even
if this approximation were completely adequate it would not help, because the
central question is simply the number of integrals occurring; in fact our
method of generating three- and four-center integrals is even a bit faster than
use of the Mulliken approximation. The value of the Mulliken rule is that it
approximately relates the magnitude of three- and four-center integrals to
two-center integrals, but this relation is only approximate and we have not yet
developed a practical method to use this knowledge in reducing the number
of three- and four-center integrals.
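The Mulliken approximation discussed above replaces an orbital product by an overlap-weighted average of the two one-center densities, χ_i χ_j ≈ ½ S_ij (χ_i² + χ_j²). The sketch below checks numerically, for two normalized one-dimensional Gaussians, that this replacement preserves the total overlap charge. It also shows the form the rule takes when applied to both orbital pairs of a two-electron integral. The one-dimensional Gaussians, the integration grid, and the function names are assumptions made only for illustration.

```python
import math


def gaussian(x, exponent, center):
    """Normalized one-dimensional Gaussian orbital."""
    norm = (2.0 * exponent / math.pi) ** 0.25
    return norm * math.exp(-exponent * (x - center) ** 2)


def integrate(func, lo=-12.0, hi=12.0, n=4800):
    """Midpoint-rule quadrature, ample for smooth, rapidly decaying integrands."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))


def mulliken_charge_check(exp_i, center_i, exp_j, center_j):
    """Return the overlap S_ij and the integral of its Mulliken replacement."""
    def chi_i(x):
        return gaussian(x, exp_i, center_i)

    def chi_j(x):
        return gaussian(x, exp_j, center_j)

    s_ij = integrate(lambda x: chi_i(x) * chi_j(x))
    approx_charge = integrate(lambda x: 0.5 * s_ij * (chi_i(x) ** 2 + chi_j(x) ** 2))
    return s_ij, approx_charge


def mulliken_two_electron(s_ij, s_kl, j_ik, j_il, j_jk, j_jl):
    """(ij|kl) estimated from Coulomb integrals when both pairs are approximated."""
    return 0.25 * s_ij * s_kl * (j_ik + j_il + j_jk + j_jl)


if __name__ == "__main__":
    s, q = mulliken_charge_check(1.0, 0.0, 1.0, 1.0)
    print(s, q)   # the two numbers agree: the overlap charge is conserved
```

Charge conservation is exact by construction. What the approximation does not guarantee is the shape of the overlap distribution, which is one way to understand the erratic behaviour reported in the text.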

Another kind of data which is potentially useful is the distribution functions
for the magnitude of molecular integrals in typical cases. Here again we meet
the relatively crude state of the art in treating problems primarily characterized
by their complexity: until now there simply has not been the capability for
generating enough of the basic data required for developing effective
parametric and model theories. Figure 22 shows the effect of different orbital
basis sets on the distribution function for the sum of all integrals in a molecular
calculation of NHJ at its equilibrium separation (31). We have computed such
distributions for almost all of the ten-electron polyhydrides and for such
diverse other species as NeO₂, C₃H₃⁺, H₄, and N₄. All have much the same
appearance with maxima at nearly the same magnitude. Crude basis sets give the
same general shape but contain many erratic excursions; it is as if
improvement in the quality of the basis set corresponded to putting the
distribution function through a low-pass filter. Figure 23 for H₃O⁺ (at
equilibrium geometry) separates the three- and four-center integral distribution
functions from the one- and two-center ones (31). Again the curves appear to have
a universal shape. Unfortunately, the three- and four-center integrals have too
broad a peak with a maximum that occurs almost over the maximum of the one- and
two-center curves (on the average it is even slightly shifted to the right of the
example in Fig. 23). Simple addition theorems (e.g., the sum of the negative one-,
two-, three-, and four-center integrals does not very closely equal the positive
three- and four-center integrals) do not appear likely. When bond distances are
expanded or contracted the distribution function moves to the left or right as
might be expected. Only at quite short distances does distortion in the shape
show up, giving rise to bimodal distributions. While interesting, these
distribution functions have not as yet provided a scheme for reducing the number
of three- and four-center integrals in any general way. The principal
application of these distribution functions that can be anticipated at present is
their use as criteria for deciding the number of significant figures required in the
generation of various classes of two-, three-, and four-center integrals.
Although not presently employed in any digital computer program in any
laboratory, it should be possible to compute different integrals to different
accuracies and still end up with the total energies computed to the present high
accuracy standard of eight significant figures. Crude estimates indicate that a
well-written digital computer program based on this principle might enjoy an
additional order of magnitude in speed over existing programs.

Fig. 23. Distribution of molecular integrals with energy range: Dashed lines, three-
and four-center integrals; solid lines, one- and two-center integrals.

In order to increase the number of potential internal relationships among the
integrals one needs to assume a particular model for constructing the wave
function. The only simple model that has enough a priori generality to be useful
is the MO SCF method (the number of important configurations varies too much
from molecule to molecule in a VB wave function). Thus we are constructing
term ratios and distribution functions for our SCF wave functions. For
example, it may be that significant and general numerical relationships will
show up in the final molecular Hartree-Fock solution that were not present in
the input ingredients, and thus we are separating three- and four-center
contributions from one- and two-center contributions in the matrix elements
of the SCF Hamiltonian.
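The suggestion above, that integrals of different magnitude could be computed to different accuracies while holding the total energy to a fixed standard, can be put in a crude per-integral form. If an integral of magnitude m must contribute an absolute error no larger than a tolerance t, it needs roughly log₁₀(m/t) significant figures. The sketch below encodes only this rule of thumb. The rule, the tolerance, and the function name are assumptions of this illustration, and it ignores how errors accumulate over many integrals.

```python
import math


def significant_figures_needed(magnitude, abs_tolerance):
    """Crude estimate of the significant figures an integral must carry.

    An integral carried to n figures has an absolute error of roughly
    |magnitude| * 10**(-n); requiring this to stay below abs_tolerance
    gives n >= log10(|magnitude| / abs_tolerance).
    """
    if magnitude == 0.0:
        return 0
    ratio = abs(magnitude) / abs_tolerance
    if ratio <= 1.0:
        return 0          # already smaller than the tolerance: may be neglected
    return math.ceil(math.log10(ratio))


if __name__ == "__main__":
    tolerance = 1.0e-8    # hypothetical absolute error allowed per integral
    for value in (0.5, 0.002, 3.0e-9):
        print(value, significant_figures_needed(value, tolerance))
```

Under this rule a small three- or four-center integral needs far fewer figures than a large one-center integral. That difference is the source of the speed gain anticipated in the text.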

IV. The Method of Molecular Orbitals-Ionic States 

From the standpoint of formal many-electron theory the results discussed
in the previous section prove two very important numerical theorems. First,
molecular Hartree-Fock solutions give a remarkably good description of the
system in terms of predicted properties and geometries: quite a bit better
than one dared hope even with a knowledge of Brillouin’s theorem and the
extended Brillouin theorem. Second, the LC (Hartree-Fock) AO
approximation is a definitely good and almost always adequate representation of
the molecular Hartree-Fock solution. Thus, to a very large degree, molecules
really are made out of atoms.

In addition to the lowest single configuration energy, the other well-known
advantages of the Hartree-Fock solution are: (a) definition of one-electron
orbitals and energies and identification of the energies with ionization
potentials; (b) quantitative determination of hybridization effects in the most
efficient manner; (c) maximum use of molecular symmetry and fewest number
of assumptions required to determine molecular charge distributions.

These overwhelming chemical advantages coupled with the relative
computational simplicity of the single determinant SCF procedure virtually
guarantee that practically all ab initio wave functions for large systems will
employ an approximate Hartree-Fock solution as leading term.

In view of the great practical success of the molecular Hartree-Fock 
solution we must carefully examine what specific chemical and physical 
effects require us to go beyond this level of approximation. Three effects are 
listed below, but it is apparent that they have a common mathematical 
origin. 

(a) Hartree-Fock solutions are inadequate for constructing potential
surfaces. Although equilibrium bond angles and bond lengths are predicted
to ±2%, the region of satisfactory representation is ≈Rₑ ± 15%. For
internuclear separations stretched by more than about 15%, the rise of the
Hartree-Fock potential surface to ionic states at infinite separation noticeably
manifests itself. This behavior plus even small errors in electron correlation
energy estimates can make it impossible to distinguish between a repulsive and
an attractive potential energy curve.

(b) It is well known that the extra molecular correlation energy produces
20-80% errors in dissociation energy predictions (even to the extent of
sometimes yielding a negative binding energy as in F₂ and some polyatomics).
Empirical correlation energy rules like those noted in the previous section and
calibration of results through experience considerably lessen the severity of
this problem, but slight erratic fluctuations remain (as in F₂), and this reduces
the ability to sharply distinguish one type of atom from another.

(c) From the textbook example of H₂, and many more complicated cases,
we know that even at equilibrium separations there is too large an ionic state
contribution, that is, the occupancy of the atomic shell structure is improperly
represented. In effect the Hartree-Fock approximation does not give an
accurate enough representation of the oxidation state of an atom in a molecule.
This can lead to the incorrect sign of the charge distribution, as appears to be
the case in CO.
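The textbook H₂ case mentioned in (c) can be written out explicitly. In a minimal basis of atomic orbitals a and b, the doubly occupied bonding orbital (a + b) expands into covalent terms, a(1)b(2) + b(1)a(2), and ionic terms, a(1)a(2) + b(1)b(2), with equal coefficients. The sketch below tabulates these coefficients and shows the effect of subtracting part of the ionic component. Normalization is omitted, and the representation and function names are assumptions of this illustration.

```python
def mo_expansion():
    """Coefficients of a(1)a(2), a(1)b(2), b(1)a(2), b(1)b(2) in (a+b)(1)(a+b)(2)."""
    return {"aa": 1.0, "ab": 1.0, "ba": 1.0, "bb": 1.0}


def subtract_ionic(wavefunction, amount):
    """Subtract `amount` times the ionic combination a(1)a(2) + b(1)b(2)."""
    result = dict(wavefunction)
    result["aa"] -= amount
    result["bb"] -= amount
    return result


def ionic_to_covalent_ratio(wavefunction):
    """Ratio of ionic to covalent expansion coefficients (not of weights)."""
    ionic = wavefunction["aa"] + wavefunction["bb"]
    covalent = wavefunction["ab"] + wavefunction["ba"]
    return ionic / covalent


if __name__ == "__main__":
    mo = mo_expansion()
    print(ionic_to_covalent_ratio(mo))                        # equal ionic and covalent parts
    print(ionic_to_covalent_ratio(subtract_ionic(mo, 0.75)))  # ionic part reduced
    print(ionic_to_covalent_ratio(subtract_ionic(mo, 1.0)))   # purely covalent limit
```

Removing the ionic combination entirely leaves the purely covalent (Heitler-London) function. A variational calculation would instead pick an intermediate amount.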

It is ironic that just as the LC (Hartree-Fock) AO MO SCF method
is especially appropriate and useful as a starting point it is singularly
poor as a basis for generating correction terms. Disadvantages of the MO
superposition of configurations technique are: (i) There are many
configurations which have the same energy and there is no adequate criterion for
selecting in advance those groups of configurations which are most important.
(ii) The logical orbitals for constructing the configuration interaction are the
unoccupied excited states of the finite expansion Hartree-Fock Hamiltonian,
but the shape and energy of these states often change radically with
occupancy. (iii) The one- and two-electron many-center integrals over atomic
orbitals which serve as input data for generating the initial solution must
be transformed to the final SCF basis for carrying out the configuration
interaction. Although the four-index transformation required is mathematically
simple, it is frequently a very time-consuming process. (iv) The MO
configuration interaction scheme does not utilize the fact that the initial solution
was LCAO. Closely related is the lack of chemical and physical
interpretability.
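The four-index transformation mentioned in (iii) converts two-electron integrals over atomic orbitals, (pq|rs), into integrals over molecular orbitals, (ij|kl) = Σ C_pi C_qj C_rk C_sl (pq|rs). The sketch below compares the direct eight-fold sum with the usual factorization into four successive one-index steps, which reduces the work from order N⁸ to order N⁵. It is written in plain Python on small nested lists for clarity, not speed, and the function names and random test data are illustrative assumptions.

```python
import itertools
import random


def zeros4(n):
    """An n x n x n x n nested list filled with zeros."""
    return [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]


def transform_naive(g, c):
    """Direct transformation: every output element is a fourfold sum (order N**8)."""
    n = len(c)
    idx = range(n)
    out = zeros4(n)
    for i, j, k, l in itertools.product(idx, repeat=4):
        out[i][j][k][l] = sum(
            c[p][i] * c[q][j] * c[r][k] * c[s][l] * g[p][q][r][s]
            for p, q, r, s in itertools.product(idx, repeat=4)
        )
    return out


def quarter_transform(g, c):
    """Transform the first index only, then rotate it to the last position."""
    n = len(c)
    idx = range(n)
    out = zeros4(n)
    for i, q, r, s in itertools.product(idx, repeat=4):
        out[q][r][s][i] = sum(c[p][i] * g[p][q][r][s] for p in idx)
    return out


def transform_stepwise(g, c):
    """Four quarter transformations (order N**5); indices return to their places."""
    for _ in range(4):
        g = quarter_transform(g, c)
    return g


def max_abs_difference(a, b):
    """Largest element-wise discrepancy between two four-index arrays."""
    n = len(a)
    return max(abs(a[i][j][k][l] - b[i][j][k][l])
               for i, j, k, l in itertools.product(range(n), repeat=4))


if __name__ == "__main__":
    random.seed(0)
    n = 3
    g = zeros4(n)
    for p, q, r, s in itertools.product(range(n), repeat=4):
        g[p][q][r][s] = random.uniform(-1.0, 1.0)
    c = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    print(max_abs_difference(transform_naive(g, c), transform_stepwise(g, c)))
```

Even in the factorized form the cost grows as the fifth power of the basis size, which is why the step is described as time consuming.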

The Molecular Orbital Minus Ionic States Method (MO-IS) has been
conceived to avoid these difficulties in the MO-configuration interaction
technique. The central fact that we can take from chemical experience and
numerical experiments is the high degree to which the LC (Hartree-Fock) AO
approximation is obeyed (32). But everyone is familiar with the LCAO
expansion of the hydrogen molecule MO solution, which shows the
characteristic overweighting of ionic terms. In polyatomic systems the principal
problem occurs for terms in the LCAO expansion corresponding to pairs of ionic
valence electron states on adjacent atoms. This difficulty can be overcome by
subtracting off valence bond-like ionic states from the wave function. The
MO-IS method is designed to accomplish this in the following manner. We
start by calculating the standard single determinant LCAO MO SCF solution.
If a true molecular Hartree-Fock solution is available for a starting point, so
much the better. Next we set up determinants corresponding to ionic states or
in some cases neutral atom states. The ionic states to be included correspond
to the singly or doubly charged bonds between adjacent atoms into which the
molecular orbital solution will separate. An important feature is that for any
given molecular problem there will be relatively few of these states, and so the
number of configuration interaction terms in the total wave function will be
kept small. Finally, the total wave function is made up from a linear
combination of the MO and the ionic states, each with a linear coefficient
determined by energy minimization.

The MO-IS wave function is simply an MO configuration interaction treatment
which uses ionic and atomic-like states to mix with the parent MO SCF state.
The principal new mathematical feature is adoption of a nonorthogonal basis.
In previous times this would have been considered an overriding disadvantage,
but now that the technical facilities are available to handle this numerically,
we can turn this feature to great advantage: it enables us to retain the identity
of the atoms in the molecule. It is also to be noted that the particular aspects
which characterize the MO-IS wave function are of a strictly molecular nature.
At the atomic limit of infinite internuclear separations it reduces to a standard
Valence Bond configuration interaction expansion. From a chemical standpoint
most of the vast store of qualitative knowledge about the electronic structure
of molecules is contained in atomic rules, and the MO-IS scheme offers a way to
couple this chemical information into a well-defined mathematical formalism.
At this point it is appropriate to inquire why one doesn’t use the VB method
itself in a straightforward way. Extensive experience at Princeton and elsewhere
demonstrates the following:

(a) For even simple polyatomic molecules there are generally many
determinants representing neutral molecules, all with rather similar
energy-determined weighting coefficients, but none with anywhere near the weight
of the Hartree-Fock solution in an equivalent MO CI expansion. It takes many
more determinants and greatly increased manipulative complexity to
represent the hybridization effects so efficiently accomplished by the parent MO
state. Comparison of solutions for particular molecules by VB and MO
methods shows that a rather large number of VB states are generally required to
yield the same energy as the Hartree-Fock state.

(b) So many states are required (neutral and ionic) to assure a symmetric
and unbiased description of the molecular-charge distribution that the size
of system which may be treated is much more limited than for the MO
method. The resulting wave function always has a somewhat lower energy than
the Hartree-Fock state, but analysis shows that the chemical effects represented
by most of the VB states are satisfactorily treated by the single Hartree-Fock
determinant.

The MO and VB methods have opposing virtues and difficulties and the 
MO-IS method attempts to exploit the virtues of both (55). Most of structural
chemistry is described and interpreted in a qualitative valence bond
framework, and it is important to see if the MO-IS method can be described in this
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type of language. Basic to this attempt is recognition of a dichotomy of 
thirty years’ standing between the descriptive language of textbook chemistry 
and the fractional charge distribution picture suggested by molecular orbital 
theory. On the one hand chemistry has been successfully organized in terms 
of a periodic table of atoms with a specified integral number of valence 
electrons surrounding a given atom. For a particular atom this number may 
vary under different bonding conditions but always remains integral. On the 
other hand, molecular orbital theory distributes charge in fractional amounts 
on all atoms and bond regions throughout the nuclear framework. Indeed, 
as Linnett (34) has pointed out, “It is interesting that exponents of the
molecular orbital method, which provides the clearest way for constructing
molecular wave functions, have never felt the need to provide chemical formulas
which illustrate and symbolize the wave functions. ... It is very probable that
one of the reasons why molecular orbital treatments were accepted only slowly 
by experimental chemists was that theoreticians were unwilling to devise 
chemical formulas to represent their ideas”. In this paper we do not pretend 
to directly solve this traditional and continuing problem but rather make some 
observations which appear to the present author to considerably reduce the 
difficulty. First, if one concentrates on the properties of the bonds rather than 
questions as to the oxidation number of the participating atoms, then to the 
zeroth approximation both viewpoints are identical and yield an answer 
expressed as an integral number of electrons. This result, of course, is just the 
simplest use of the bond order concept. Second, Linnett (34) has recently 
introduced an improved qualitative language and notation, based on the 
valence bond approach, which retains integral numbers of electrons but 
distinguishes between α-spin and β-spin electrons. His scheme, termed the
“Double Quartet” or “Non-Pairing” method includes the conventional 
“electron dot” picture as a subclass but is able to describe a far larger 
number of bonding situations—including, in fact, almost all known classes of 
chemical compounds. Although the electrons are loosely identified with 
atoms, this new description actually permits a far more flexible distribution of 
electrons around the nuclear framework than the traditional picture in terms 
of a rigid number of electrons identified with a specific atom. Linnett has 
devised a prescription for constructing a many-electron wave function from 
atomic orbitals corresponding to his symbolic notation. We have expanded 
an MO LCAO wave function for several of the systems treated by Linnett 
and his co-workers, and if we arbitrarily cross out terms in this expansion 
corresponding to pairs of doubly occupied atomic orbitals on adjacent 
atoms we find that the remaining wave function contains the same type of 
terms represented in Linnett’s wave functions. Qualitatively this is the same 
argument on which the MO-IS method is based, thus demonstrating the 
close similarity between his modified valence bond scheme and our modified 
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molecular orbital scheme. This further suggests that MO-IS wave functions 
will lead to useful pictures for descriptive chemistry. 

The numerical procedure for computing MO-IS wave functions is fortunately
straightforward and makes use of well-established digital computer
routines (55). First, an LCAO MO SCF solution is obtained (36). Second, 
ionic valence bond states, and perhaps others selected on chemical criteria, 
are set up as determinants made from atomic orbitals. Third, matrix elements 
of the Hamiltonian between determinants are calculated in the usual way 
using the same one- and two-electron integrals computed over atomic orbitals 
required as input data for the SCF solution. The only new kind of matrix 
element is that between the MO state and a valence bond determinant, but 
this simply involves linear combinations of integrals with already determined 
coefficients. Of course, the orbitals in the ionic state terms are not orthogonal 
to those in the MO state, and this means that it is necessary to use the overlap 
determinant formalism for the matrix elements (57), but this occurs already in
the valence bond method itself and, as discussed in the previous section, this
technique, like the MO SCF procedure, has been thoroughly reduced to practice.
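
The nonorthogonality mentioned above changes the final diagonalization from an ordinary eigenvalue problem into a generalized one, (H - E S) c = 0, where S is the overlap matrix between the parent MO state and the added states. The following is a minimal sketch for the smallest possible case, one MO state and one ionic state. The function name and all numerical matrix elements are hypothetical illustrations, not values from any calculation reported here.

```python
import math

def lowest_state_2x2(h11, h22, h12, s):
    """Lowest root E and coefficients (c1, c2) of (H - E S) c = 0.

    H = [[h11, h12], [h12, h22]] and S = [[1, s], [s, 1]] with |s| < 1.
    """
    # det(H - E S) = 0 expands to a*E^2 + b*E + c = 0.
    a = 1.0 - s * s
    b = -(h11 + h22 - 2.0 * s * h12)
    c = h11 * h22 - h12 * h12
    energy = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    # The first row of the secular equations fixes the ratio c1 : c2.
    c1 = h12 - energy * s
    c2 = -(h11 - energy)
    # Normalize with the overlap metric: c1^2 + c2^2 + 2 s c1 c2 = 1.
    norm = math.sqrt(c1 * c1 + c2 * c2 + 2.0 * s * c1 * c2)
    return energy, (c1 / norm, c2 / norm)

# Hypothetical matrix elements (atomic units), chosen only for illustration:
# state 1 plays the role of the parent MO state, state 2 of an ionic state.
energy, (c_mo, c_ionic) = lowest_state_2x2(h11=-1.0, h22=-0.6, h12=-0.5, s=0.4)
print(energy, c_mo, c_ionic)
```

With these illustrative numbers the mixed state lies below the energy of the parent state alone, as the variational principle requires. For more than two states the same secular problem would be handed to a library routine for the symmetric generalized eigenproblem.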

At the time of writing MO-IS results are just being completed for NeH$_2$,
LiH, Li$_2$, C$_2$, N$_2$, and F$_2$. Although not yet completely analyzed, the results
are very encouraging for all cases, an especially gratifying situation because
MO-IS is basically a per bond, left-right correlation scheme and thus an
almost complete test of the method can be achieved by calculations on
diatomic species alone. It is also important to note that spin symmetrization
problems in MO-IS enjoy the same simplicity as in the standard MO method. 
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I. The Exclusion Principle and Intermolecular Forces 

It is almost an axiom of science that the interaction energy between two 
atoms, and hence the forces between them, shall vanish when the distance 
between them becomes very great. For if this were not true the universe would 
contain no isolated systems and the idealizations implicit in every physical 
science become meaningless. In classical physics this belief is clearly warranted, 
for it is known that all forces between the elementary constituents of nature 
fall off with distance R between interacting partners at least as rapidly as 
$R^{-2}$, those between nuclear particles much more rapidly, and the problem of
isolation presents no difficulty. But in quantum mechanics, which requires the 
use of Pauli’s exclusion principle, curious features appear which, unless fully 
understood, cast doubt on the existence of isolated systems containing similar 
particles (e.g., electrons). Perhaps because they give rise to some perplexity 
these features remain undiscussed. In this article the author wishes to expose 
them and show that they disappear under an analysis which involves basic 
aspects of the theory of measurement; this analysis, in turn, places the 
exclusion principle in a new light and raises questions of some interest. 

The situation at issue arises in the theory of intermolecular forces and will 
here be sketched in broad outline. For details Hirschfelder's (1) impressive
book, or the review articles by Pitzer (2) or this author (5), may be consulted.

Let two interacting atoms be labeled a and b. Atom a contains m electrons,
atom b, n electrons. Their energies are, respectively,

$$H_a = T_a - z_a e^2 \sum_{\lambda=1}^{m} r_{a\lambda}^{-1} + e^2 \sum_{\lambda>\mu=1}^{m} r_{\lambda\mu}^{-1}, \tag{1}$$

$$H_b = T_b - z_b e^2 \sum_{\lambda=m+1}^{m+n} r_{b\lambda}^{-1} + e^2 \sum_{\lambda>\mu=m+1}^{m+n} r_{\lambda\mu}^{-1}, \tag{2}$$

while their energy of interaction is given by

$$V = -z_a e^2 \sum_{\lambda=m+1}^{m+n} r_{a\lambda}^{-1} - z_b e^2 \sum_{\lambda=1}^{m} r_{b\lambda}^{-1} + e^2 \sum_{\lambda=1}^{m} \sum_{\mu=m+1}^{m+n} r_{\lambda\mu}^{-1} + z_a z_b e^2 R^{-1}. \tag{3}$$

As to notation, $T_a$ and $T_b$ are the kinetic energies of all electrons in atoms a and
b; in terms of the momentum $p_i$ of the ith electron and its mass m,
$T_a = \sum_{\lambda=1}^{m} p_\lambda^2/2m$; $r_{ai}$ is the distance of electron i from nucleus a, $r_{ij}$ the distance
between electrons i and j. Finally, R is the separation of the nuclei of the two
atoms.

* Work done under Contract AFOSR 249-64.

We note first that V vanishes when R becomes very large, provided the
electrons remain attached to their parent atoms, for in that case every
interatomic $r_{ij}$ (and only these occur in V) becomes infinite. This is the reason
for the classical result that the interaction between distant atoms tends to
naught.

The story is not much different in quantum mechanics, provided each
atom is treated as an identity and the full requirement of the Pauli principle
is not imposed. The latter must be applied, of course, to the electrons of each
atom separately in order that the result shall have even a semblance of validity.
The customary method is this. One writes a product of orbitals for atom a,
one for each electron, viz.,

$$\varphi_a = a_1(1)\,a_2(2)\,a_3(3) \cdots a_m(m) \tag{4}$$

and likewise for atom b:

$$\varphi_b = b_1(m+1)\,b_2(m+2) \cdots b_n(m+n). \tag{5}$$

Each of these is then “antisymmetrized” by applying operators

$$\mathcal{A}_a = \sum_\lambda (-1)^\lambda P^a_\lambda \tag{6}$$

and

$$\mathcal{A}_b = \sum_\lambda (-1)^\lambda P^b_\lambda \tag{7}$$

to the product functions. In these formulas, $P^a$ and $P^b$ stand for all
permutations among the electrons of atoms a and b, the subscripts label a given
permutation, and are taken to be even for even, odd for odd permutations, i.e., for
permutations composed of an even or odd number of elementary
transpositions. The individual atomic functions then become

$$\psi_a = \mathcal{A}_a \varphi_a, \qquad \psi_b = \mathcal{A}_b \varphi_b \tag{8}$$

and the energy of atoms a and b, in the approximation provided by this
choice of orbitals, is

$$h_a = \frac{\int \psi_a^* H_a \psi_a \, d\tau_a}{\int \psi_a^* \psi_a \, d\tau_a}, \qquad h_b = \frac{\int \psi_b^* H_b \psi_b \, d\tau_b}{\int \psi_b^* \psi_b \, d\tau_b}. \tag{9}$$

Here $d\tau_a$ and $d\tau_b$ are volume elements in the spaces of the electrons of atom
a and of b; their product will be written $d\tau$.

The interaction energy is

$$v = \frac{\int \psi_a^* \psi_b^* V \psi_a \psi_b \, d\tau}{\int \psi_a^* \psi_a \, d\tau_a \int \psi_b^* \psi_b \, d\tau_b}, \tag{10}$$

and its properties are in accord with classical expectations, in particular,
$\lim_{R \to \infty} v = 0$. Equation (10) is indeed used in the calculation of long-range
interatomic forces and gives correct answers. But for small values of R it is
woefully wrong, for it neglects exchange forces.

To render a proper account of them it is necessary to antisymmetrize the
state function of the entire system, which is composed of a and b. This
involves the conjunction of a further antisymmetrizer, $\mathcal{A}_{ab}$, with $\mathcal{A}_a$ and $\mathcal{A}_b$:

$$\mathcal{A}_{ab} = \sum_\lambda (-1)^\lambda P^{ab}_\lambda,$$

where $P^{ab}$ is the complex of intermolecular electron permutations,
$(m+n)!/m!\,n!$ in number, which exchange electrons between a and b. The
complete state function, in this approximation, then, is

$$\Psi = \mathcal{A}_{ab}\,\psi_a \psi_b. \tag{11}$$

With it one can calculate the expectation values

$$\bar H_a = \int \Psi^* H_a \Psi \, d\tau \Big/ \int \Psi^* \Psi \, d\tau, \tag{12}$$

$$\bar H_b = \int \Psi^* H_b \Psi \, d\tau \Big/ \int \Psi^* \Psi \, d\tau, \tag{13}$$

$$\bar V = \int \Psi^* V \Psi \, d\tau \Big/ \int \Psi^* \Psi \, d\tau, \tag{14}$$

and these, in the literal interpretation of the elementary axioms of quantum
mechanics, should be the energy values observed on the average when
measurements are made. But it turns out on computation that none of the quantities
$\bar H_a$, $\bar H_b$, or $\bar V$ shows the correct behavior when the atoms are at an infinite
distance from each other; $\bar H_a$ does not approach $h_a$, $\bar H_b$ does not approach
$h_b$, and $\bar V$ does not go to zero. What does happen, and what saves the
calculation from absurdity, is that the sum of these quantities

$$\bar H = \bar H_a + \bar H_b + \bar V \tag{15}$$

reduces to $h_a + h_b$ as $R \to \infty$, so that the remainder,

$$\bar H - h_a - h_b = \Delta E,$$

can be regarded as an interaction energy. Even this quantity, in spite of its
correct asymptotic behavior, cannot be guaranteed to be the actual potential
energy between the atoms, nor can it be placed between rigorous mathematical
limits.

Evidently, $\bar H_a$, $\bar H_b$, and $\bar V$ are physically meaningless although they seem
to satisfy the rules of computation. The mathematical reason for this collapse
of meaning is easily discovered. The operator $H_a$ is invariant with respect to
all $P^a$, $H_b$ with respect to all $P^b$, V with respect to $P^a P^b$, but none of them is
invariant with respect to $P^{ab}$. Thus each of them commutes with $\mathcal{A}_a \mathcal{A}_b$, but
not with $\mathcal{A}_{ab}$. When expectation values are computed with a state function
which has the symmetry imposed by $\mathcal{A}_{ab}$, a symmetry which the operators do
not share, awkward additional terms appear. The sum of (1), (2), and (3),
the total energy H, however, is invariant with respect to the symmetric group
on all m + n electrons, i.e., with respect to $P^{ab}$ as well as $P^a$ and $P^b$, so that in
its computation the awkward terms of $\bar H_a$, $\bar H_b$, and $\bar V$ cancel out. Nevertheless
there remains a basic physical paradox attached to the unwelcome features 
of these individual quantities, and the remainder of this paper seeks to 
interpret them. In the next section we illustrate what is involved by reference 
to a simple example. 

II. Interaction of Two Hydrogen-Like Atoms 

Let the nucleus of the first atom be situated at the point a, that of the second
at b. We shall further designate the orbital function localized about the point
a by the letter a, so that a(1) is the state function of electron 1 about the proton
at a. The nuclei carry charges $z_a$ and $z_b$. The functions a(1) and b(2) satisfy the
equations

$$H_a\, a(1) = E_a\, a(1) \quad \text{and} \quad H_b\, b(2) = E_b\, b(2); \tag{16}$$

their form is well known, and

$$H_a = -\tfrac{1}{2}\nabla_1^2 - \frac{z_a}{r_{a1}}, \qquad H_b = -\tfrac{1}{2}\nabla_2^2 - \frac{z_b}{r_{b2}} \tag{17}$$

when written in atomic units.

In this case $\psi_a = a$, $\psi_b = b$, since in the presence of a single electron $\mathcal{A}_a$
and $\mathcal{A}_b$ are 1. But $\mathcal{A}_{ab} = 1 - T_{12}$, where $T_{12}$ effects a transposition of electrons
1 and 2. We then have

$$h_a = E_a, \qquad h_b = E_b, \tag{18}$$

and these values are in the present instance correct without approximation
because a and b satisfy Eqs. (16) exactly. Furthermore,

$$\bar H_a = \int [\;]\, H_a\, [\;]\, d\tau \Big/ \int [\;]^2\, d\tau,$$

where each bracket contains the quantity $[a(1)b(2) - b(1)a(2)]$, which is real.
Upon expansion, and with the use of a well-known notation we obtain

$$\bar H_a = \{\langle a|H_a|a\rangle - 2\langle b|H_a|a\rangle\,\delta + \langle b|H_a|b\rangle\}/2(1-\delta^2)
= \{E_a(1 - 2\delta^2) + \langle b|H_a|b\rangle\}/2(1-\delta^2), \tag{19}$$

where $\delta$ is the overlap integral $\int a(1)\,b(1)\,d\tau_1$. This would be the expected
$E_a$ if the term $\langle b|H_a|b\rangle$ were replaced by $\langle a|H_a|a\rangle$. Let us therefore examine
that term. Since, by (17),

$$H_a(1) = H_b(1) + \frac{z_b}{r_{b1}} - \frac{z_a}{r_{a1}},$$

we have

$$\langle b|H_a|b\rangle = \Big\langle b\Big|H_b + \frac{z_b}{r_b} - \frac{z_a}{r_a}\Big|b\Big\rangle
= E_b + \langle b|z_b/r_b|b\rangle - \langle b|z_a/r_a|b\rangle. \tag{20}$$

Below we shall need the analogous form

$$\langle a|H_b|a\rangle = E_a + \langle a|z_a/r_a|a\rangle - \langle a|z_b/r_b|a\rangle. \tag{21}$$

For the special case of hydrogen-like atoms, $\langle b|z_b/r_b|b\rangle = -2E_b$; in any case
this quantity is independent of R, while $\langle b|z_a/r_a|b\rangle$ vanishes when R becomes
infinite.

In the same way one finds

$$2(1-\delta^2)\,\bar H_b = E_b(1 - 2\delta^2) + \langle a|H_b|a\rangle,$$

$$\begin{aligned}
2(1-\delta^2)\,\bar V ={}& 2\langle ab|r_{12}^{-1}|ab\rangle - 2\langle ab|r_{12}^{-1}|ba\rangle + 2(1-\delta^2)R^{-1} \\
& - \langle a|z_a/r_a|a\rangle - \langle b|z_b/r_b|b\rangle - \langle b|z_a/r_a|b\rangle - \langle a|z_b/r_b|a\rangle \\
& + 2\langle b|z_a/r_a|a\rangle\,\delta + 2\langle a|z_b/r_b|b\rangle\,\delta.
\end{aligned}$$

This last expression arises from the classical potential energy

$$V = \frac{1}{r_{12}} + \frac{1}{R} - \frac{z_a}{r_{a2}} - \frac{z_b}{r_{b1}}. \tag{22}$$

As $R \to \infty$, $\delta$ as well as $\langle b|z_a/r_a|b\rangle$ and $\langle a|z_b/r_b|a\rangle$ vanish and $\bar V$ reduces to
the two terms $-\tfrac{1}{2}\langle a|z_a/r_a|a\rangle - \tfrac{1}{2}\langle b|z_b/r_b|b\rangle$, which are large. They arise from the
action of $T_{12}$ on ab. They are precisely the terms which survive in $\bar H_a + \bar H_b$,
but with opposite sign, so that all large components of $\bar H$ cancel. Indeed

$$\lim_{R \to \infty} \bar H = E_a + E_b.$$

We now return to consider the separate expectation value $\bar H_a$ which,
according to Eq. (19), has the limiting form

$$\lim_{R \to \infty} \bar H_a = \tfrac{1}{2}\,[E_a + \langle b|H_a|b\rangle]. \tag{23}$$

From experiment we know that, if atom b is infinitely far away, the measured
energy is $E_a$ in every observation, yet (23) says that we find with equal
probability the values $E_a$ and $\langle b|H_a|b\rangle$, of which the latter would be the average
energy of atom a if its electron were situated about b. Does the Pauli principle,
which is responsible for this result, cause an obscure effect to be exerted on
atom a by b even when it is infinitely far away? And if there were many other
atoms, c, d, e, etc., all at infinity, would they cause the energy of a to be an
average value of $E_a$ and $\langle b|H_a|b\rangle$, $\langle c|H_a|c\rangle$, $\langle d|H_a|d\rangle$, etc.? The answer would
be affirmative if the Pauli principle were a universal fact of nature, inviolable
in its stringency upon all constituents of the universe. Light is thrown upon
this state of affairs if the meaning of an expectation value like $\bar H_a$ is examined
from the point of view of von Neumann’s theory of measurement.
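
The cancellation described in this section can be followed numerically for two hydrogen-like atoms in their 1s ground states. The sketch below assumes atomic units, the ground-state energy $E = -z^2/2$, and the relation $\langle z/r \rangle = -2E$ quoted in the text; the limit used for the interaction term is read off from the expression for $2(1-\delta^2)\bar V$ with all overlap-dependent and R-dependent integrals set to zero. The function name and dictionary keys are illustrative only.

```python
def separated_atom_limits(z_a, z_b):
    """R -> infinity limits of the separate expectation values (atomic units)."""
    e_a = -0.5 * z_a ** 2            # E_a for a hydrogen-like 1s orbital
    e_b = -0.5 * z_b ** 2            # E_b
    attr_a = -2.0 * e_a              # <a|z_a/r_a|a> = -2 E_a
    attr_b = -2.0 * e_b              # <b|z_b/r_b|b> = -2 E_b
    b_ha_b = e_b + attr_b            # Eq. (20) with <b|z_a/r_a|b> -> 0
    a_hb_a = e_a + attr_a            # Eq. (21) with <a|z_b/r_b|a> -> 0
    h_a_bar = 0.5 * (e_a + b_ha_b)   # Eq. (23)
    h_b_bar = 0.5 * (e_b + a_hb_a)   # analogue of Eq. (23) for atom b
    v_bar = -0.5 * (attr_a + attr_b) # surviving terms of the interaction
    return {
        "E_a": e_a,
        "E_b": e_b,
        "H_a_bar": h_a_bar,
        "H_b_bar": h_b_bar,
        "V_bar": v_bar,
        "H_bar": h_a_bar + h_b_bar + v_bar,
    }

limits = separated_atom_limits(z_a=1.0, z_b=1.0)
print(limits)
```

For two hydrogen atoms the separate quantities come out as 0, 0, and -1 hartree instead of the expected -0.5, -0.5, and 0, while their sum still equals $E_a + E_b = -1$ hartree.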


III. Theory of Measurements 


A brief review of the relevant parts of measurement theory will now be
given. We preface it by recalling some elementary points concerning the
probabilities of quantum mechanics (4, 5). When a physical system, like our
atom a, whose coordinates we shall continue to designate by (1), is in a pure
quantum state such as is represented by $\psi_a(1)$, the expectation value of any
observable G(1) which can be measured upon it is
$\langle\psi_a|G|\psi_a\rangle / \langle\psi_a|\psi_a\rangle = \bar G$.
Henceforth we assume $\langle\psi_a|\psi_a\rangle$ to be 1. Another way of writing $\bar G$ involves
the statistical matrix $\rho$; it is

$$\bar G = \mathrm{Tr}(\rho G), \qquad \mathrm{Tr} = \text{trace} = \text{diagonal sum}. \tag{24}$$

In this expression G is the matrix of the operator G in some orthonormal set
of basis functions $u_\lambda(1)$, $G_{ij} = \int u_i^* G u_j \, d\tau$. On the other hand, $\rho$ is constructed
from the probability amplitudes $c_\lambda$, defined through

$$\psi_a = \sum_\lambda c_\lambda u_\lambda,$$

by the rule $\rho_{ij} = c_i c_j^*$. This matrix satisfies the relation

$$\rho^2 = \rho, \tag{25}$$

and $\mathrm{Tr}\,\rho = 1$. The state is called pure in the quantum sense because it
conveys maximal knowledge, knowledge not mixed with ignorance, about
system a. To be sure, one knows the occurrence of the values of an observable
like G only with probabilities, but these probabilities are irreducible by physical
operations in a sense discussed in Ref. 5; the uncertainty principle sets the
ultimate limit of all possible reductions, and that principle is embodied in the
representation of the state as $\psi_a$. Whenever the probabilities residing in a
quantum mechanical state are thus irreducible, its statistical matrix satisfies
Eq. (25), even if its origin, its construction by the rule just given, is not evident;
$\rho$ is then said to be an elementary matrix or a projection operator.

Now it often happens, for instance in statistical mechanics, that we do not
know whether system a is in fact in the quantum state $\psi_a$. Our ignorance may
commit us to saying that it is in a state $\psi_a^{(1)}$ with a probability $\omega_1$, in $\psi_a^{(2)}$
with probability $\omega_2$, and so on. These probabilities, which are not necessarily
the squares of quantum amplitudes but have a simple classical origin, bespeak
a removable kind of ignorance; they are reducible at will by observations or
physical manipulation. Yet, whether they are present or not, one can form the
expectation value of G in the manner of Eq. (24),

$$\bar G = \mathrm{Tr}(\rho G), \tag{26}$$

provided one defines

$$\rho = \omega_1 \rho^{(1)} + \omega_2 \rho^{(2)} + \cdots \tag{27}$$

and constructs $\rho^{(1)}$ from $\psi_a^{(1)}$, $\rho^{(2)}$ from $\psi_a^{(2)}$, etc. This $\rho$, however, will not
satisfy Eq. (25); and it is said to represent a mixture (unless all but one of the
$\omega_i$ are 0). When a mixture is present, there always exist means (selection
of systems from an ensemble or other forms of state preparation) whereby
all $\omega$'s can be eliminated in favor of one, which then becomes 1. We thus
obtain a pure case with irreducible probabilities. Whether a pure case is
present can be ascertained theoretically by subjecting $\rho$ to the test of Eq. (25).
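
The test of Eq. (25) is easy to carry out numerically. The sketch below builds a statistical matrix from amplitudes by the rule $\rho_{ij} = c_i c_j^*$, forms a weighted mixture as in Eq. (27), and checks whether $\rho^2 = \rho$. The helper names and the sample amplitudes and weights are illustrative assumptions.

```python
def pure_matrix(amplitudes):
    """Statistical matrix rho_ij = c_i * conj(c_j) of a pure state."""
    return [[ci * complex(cj).conjugate() for cj in amplitudes]
            for ci in amplitudes]

def mixture(weights, matrices):
    """Weighted sum rho = w1*rho1 + w2*rho2 + ... as in Eq. (27)."""
    size = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(size)] for i in range(size)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_pure(rho, tol=1e-12):
    """True when rho squared equals rho, the test of Eq. (25)."""
    squared = matmul(rho, rho)
    return all(abs(squared[i][j] - rho[i][j]) < tol
               for i in range(len(rho)) for j in range(len(rho)))

def trace(rho):
    return sum(rho[i][i] for i in range(len(rho)))

# A normalized pure state with amplitudes (0.6, 0.8i).
rho_pure = pure_matrix([0.6, 0.8j])
# An equal-weight mixture of the two basis states.
rho_mixed = mixture([0.5, 0.5], [pure_matrix([1, 0]), pure_matrix([0, 1])])

print(is_pure(rho_pure), is_pure(rho_mixed))
```

Both matrices have unit trace, but only the first passes the test; the second is the kind of matrix that signals removable ignorance.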

A measurement performs the opposite of such reduction: it converts a pure
case into a mixture (in the Hilbert space of the system whose properties are
being measured) (6). For let the system, our atom a, be in a state $\psi_a(1)$ before
measurement. The measuring apparatus might be atom b, whose state before
measurement is $\psi_b(2)$. Both of them, prior to the measurement interaction
when they are effectively an infinite distance apart, are in a product state
$\psi_a(1)\psi_b(2)$.
When the interaction takes place that state is changed to one which must be
written

$$\psi(1,2) = \sum_{\lambda\mu} c_{\lambda\mu}\, u_\lambda(1)\, v_\mu(2), \tag{28}$$

where the $u_\lambda$, as before, label a complete orthonormal set in the space of a,
while the $v_\mu$ span the space of b in similar fashion. The occurrence of the
measurement interaction manifests itself in the nature of the coefficients
$c_{\lambda\mu}$, which can no longer be written as products $c_\lambda^a c_\mu^b$ as was the case before
the measurement took place.

If now we calculate $\bar G(1)$ with (28), we get

$$\bar G = \int \psi^*(1,2)\, G(1)\, \psi(1,2)\, d\tau = \sum_{\lambda\nu\mu} c^*_{\lambda\mu} c_{\nu\mu} G_{\lambda\nu}, \tag{29}$$

and this takes the form

$$\bar G = \mathrm{Tr}(\rho G) \tag{29a}$$

if we define

$$\rho_{\nu\lambda} = \sum_\mu c_{\nu\mu} c^*_{\lambda\mu}. \tag{30}$$

This statistical matrix no longer satisfies Eq. (25), as inspection will show.
When describing system a alone, in its own proper Hilbert space, one must
therefore conclude that the measurement has converted its originally pure
state into a mixture. It may be shown that the now reducible probabilities $\omega_i$,
which appear in this mixture, are the squares of the probability amplitudes of
$\psi_a$ in the orthonormal set which forms the eigenstates of G. After the
measurement, and because the mixture contains reducible probabilities, one can by
physical operations select from an ensemble of systems those which are in any
one of the states $\psi_a^{(1)}$, $\psi_a^{(2)}$, etc.
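
The loss of purity can be checked directly from the coefficient matrix of Eq. (28). The sketch below assumes that the statistical matrix of system a alone has the form $\rho_{\nu\lambda} = \sum_\mu c_{\nu\mu} c^*_{\lambda\mu}$, which is what Eq. (29) requires for $\bar G = \mathrm{Tr}(\rho G)$. It compares a factorable coefficient matrix with one that cannot be factored; the function names and sample coefficients are illustrative.

```python
import math

def reduced_matrix(coeffs):
    """rho[nu][lam] = sum over mu of c[nu][mu] * conj(c[lam][mu])."""
    rows, cols = len(coeffs), len(coeffs[0])
    return [[sum(coeffs[nu][mu] * complex(coeffs[lam][mu]).conjugate()
                 for mu in range(cols))
             for lam in range(rows)] for nu in range(rows)]

def purity_defect(rho):
    """Largest element of rho^2 - rho in absolute value (0 for a pure case)."""
    size = len(rho)
    squared = [[sum(rho[i][k] * rho[k][j] for k in range(size))
                for j in range(size)] for i in range(size)]
    return max(abs(squared[i][j] - rho[i][j])
               for i in range(size) for j in range(size))

root_half = 1.0 / math.sqrt(2.0)

# Before the interaction: c[lam][mu] = c_a[lam] * c_b[mu] (a product state).
c_a = [0.6, 0.8]
c_b = [root_half, root_half]
product_coeffs = [[x * y for y in c_b] for x in c_a]

# After the interaction: coefficients that cannot be written as a product.
fused_coeffs = [[0.0, root_half], [-root_half, 0.0]]

print(purity_defect(reduced_matrix(product_coeffs)))  # close to zero
print(purity_defect(reduced_matrix(fused_coeffs)))    # clearly nonzero
```

The product state leaves system a in a pure case, while the fused coefficients give a reduced matrix equal to one half of the unit matrix, a mixture with equal weights.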

We shall now show that the Pauli principle “ fuses ” the states of atom a 
and atom b in the same way in which a measurement fuses the states of system 
and apparatus. 


IV. Pauli Principle and Measurement 

Combining the considerations of Sections II and III, we treat atom a as the
system whose energy is to be measured, atom b as the interacting probe. To
the operator G(1) there corresponds $H_a(1)$, and the exclusion principle requires
that we write $\psi(1,2) = [a(1)b(2) - b(1)a(2)]/[2(1-\delta^2)]^{1/2}$. In the limit, as
$R \to \infty$, $\delta$ vanishes while a and b become orthogonal. Hence, in the notation
of Eq. (28), we are dealing with two orthogonal functions,

$$u_1 = a, \qquad u_2 = b,$$

and the set v is identical with set u, $v_1 = a$, $v_2 = b$. The matrix of coefficients is

$$c = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{30a}$$

Calculation of $\bar H_a$ leads to the result analogous to (29), and $\rho_{ij}$ is given by
(30):

$$\rho = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{31}$$

The square of $\rho$ is not $\rho$ but $\rho/2$; hence we are dealing with a mixture involving
reducible probabilities. The expectation value $\bar H_a$ does not display maximum
knowledge. It says: if you do not know (although you could!) whether an
electron is attached to nucleus a or b, the mean value of its energy is given by
Eq. (23). This result is not astounding; it brings together some stray bits of
knowledge but it also stimulates reflections of a rather basic sort and leads to
such conclusions as the following.

First, we note once more that $\bar H_a$ is not an ordinary quantum expectation
value, for it involves removable ignorance. If sufficient knowledge is available
our calculation of $\bar H_a$ is incorrect, for it must then be computed in defiance of
the exclusion principle, with the function $a(1)b(2)$ alone, for which $\rho$ is an
elementary matrix. One way of obtaining this knowledge is to subject atom a 
to a set of measurements (e.g., of its position relative to b, or of its energy) 
which give assurance of its isolation. 

This, in turn, suggests the heretic thought that Pauli’s principle does not 
have the universal validity with which it is usually endowed, for it can be 
breached by the selective procedures just cited. It imparts a measure of
ignorance which can in principle be eliminated so far as a single system is concerned.

For infinitely separated, i.e., noninteracting, systems this claim is true. But 
suppose the systems interact dynamically. It is then no longer possible to 
measure position or energy of a alone. And if, by way of a hypothetical initial 
condition contradicting this fact, we were informed that at a time $t_0$ electron
1 was certainly attached to a, the presence of a finite V in the time-dependent
Schrödinger equation would cause the fluctuations known to chemists as
resonance; the initial function $a(1)b(2)$ would soon transform itself into a
fused function of the form (28) with coefficients periodic in time and forever
unable to satisfy the Pauli principle, representing forever a nonstationary
state. This, however, is not an indictment of the principle; it merely bespeaks
an internal contradiction: to claim that the initial state is $a(1)b(2)$ implies a
sharp energy, E a + E b , of the total system, yet the calculation winds up 
with a nonstationary state, in which the energy is not sharp. Resonance can 
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therefore not be interpreted as a periodic fluctuation between states which 
violate the exclusion principle. The only states of the combined system 
which are observed in nature are those which satisfy it. 

Every observable defined for the total system, atoms a and b, yields
expectation values with maximal information. A proper observable of this kind
is necessarily symmetric with respect to an interatomic exchange of electrons.
With respect to such an operator, say $G(1,2) = G(2,1)$, the basis functions
are $U_1 = a(1)b(2)$ and $U_2 = b(1)a(2)$; they are orthogonal in the space (1, 2).
Hence, as $R \to \infty$, $\psi(1,2) = \sqrt{1/2}\,(U_1 - U_2)$,

$$\bar G(1,2) = \mathrm{Tr}(\rho G), \quad \text{with} \quad \rho = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},$$

and this matrix satisfies $\rho^2 = \rho$.
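
The purity of the two-electron matrix can be verified in one line. Assuming the factor 1/2 that follows from the normalization of $\psi(1,2)$ in the basis $U_1$, $U_2$, a short worked check reads:

```latex
\rho = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},
\qquad
\rho^2 = \frac{1}{4}\begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix} = \rho,
\qquad
\mathrm{Tr}\,\rho = 1 .
```

By contrast, one half of the 2 x 2 unit matrix, the form obtained when atom a is described alone, squares to one quarter of the unit matrix and therefore fails the same test.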

The potential energy V considered in Section II is not a proper operator of
this kind. It has a symmetric part, $V_s = R^{-1} + r_{12}^{-1}$, and $V_s$ vanishes for
infinite R. The remainder, according to Eq. (22), is of the form $V_a(1) + V_b(2)$.
Its expectation value is

$$\bar V = \mathrm{Tr}[\rho(V_a + V_b)],$$

and $\rho$ is again given by Eq. (31). Hence $\bar V$ is not a pure-case expectation value;
what was said about $\bar H_a$ (and is true for $\bar H_b$) holds here as well.

If the distinction made in this article is ignored, the presence of
nonvanishing terms in $\bar V$, $\bar H_a$, and $\bar H_b$ in the limit of infinite separation must be
interpreted as a physical influence between distant systems, which precludes the
possibility of isolation. We have shown that these finite interactions are 
not physical at all but reflect subjective matters, removable ignorance. 

There remain, however, some fundamental questions regarding the
exclusion principle. Evidently, physical operations can break its hold upon the
states of infinitely separated systems. The union it provides between them is
like that produced by a measurement, which imparts “knowledge” of system
a to system b. In still cruder language, the operator $\mathcal{A}_{ab}$ when applied to a
set of isolated orbitals seems to be an affidavit certifying that the occupants 
of these orbitals have “met” at some time in the past. And if all identical 
constituents in the universe require their states to be antisymmetrized, the 
implication of their having been in dynamic contact in the past is strong indeed. 

In this article we have dealt only with fermions. It is clear that
symmetrization has similar effects upon the statistical matrices as the action of $\mathcal{A}_{ab}$.
Hence bosons are included in our analysis.
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I. The Newtonian Dynamics in Hamiltonian Form 

The motion of a free isotropic rigid body can be expressed in terms of the
three coordinates $q_x$, $q_y$, $q_z$ of its center of mass, canonically conjugate
components of momentum $p_x$, $p_y$, $p_z$, and components of angular momentum
about its center of mass $\omega_x$, $\omega_y$, $\omega_z$. These dynamical variables have Poisson
brackets $(q_x, p_x) = 1$, $(q_y, p_y) = 1$, $(q_z, p_z) = 1$, $(\omega_y, \omega_z) = \omega_x$,
$(\omega_z, \omega_x) = \omega_y$, $(\omega_x, \omega_y) = \omega_z$, the remaining Poisson brackets vanishing.

We may take as Hamiltonian functions to give displacements in position,
orientation, velocity, and time, the usual components of linear momentum,
angular momentum, mass times the coordinates of the center of mass, and
energy, m being the mass:

$$\begin{aligned}
X &= p_x, & L &= q_y p_z - q_z p_y + \omega_x, & U &= m q_x, \\
Y &= p_y, & M &= q_z p_x - q_x p_z + \omega_y, & V &= m q_y, \\
Z &= p_z, & N &= q_x p_y - q_y p_x + \omega_z, & W &= m q_z,
\end{aligned}
\qquad \mathcal{H} = \frac{1}{2m}\,(p_x^2 + p_y^2 + p_z^2).$$

That the Poisson brackets of these functions are linear combinations of the
functions themselves, in general inhomogeneous, with constant coefficients,
is the condition that the corresponding equations of motion admit a group
with these coefficients as structure constants. In this case we have the structure
of the Newtonian group.
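
A few brackets worked by hand show what "linear, in general inhomogeneous" means here. The following sketch assumes the convention $(q_x, p_x) = 1$ stated above and takes the energy to be $\mathcal{H} = (p_x^2 + p_y^2 + p_z^2)/2m$:

```latex
\begin{aligned}
(L, M) &= (q_y p_z,\; q_z p_x) + (q_z p_y,\; q_x p_z) + (\omega_x, \omega_y)
        = -q_y p_x + q_x p_y + \omega_z = N, \\
(U, X) &= m\,(q_x, p_x) = m, \\
(\mathcal{H}, U) &= \frac{1}{2m}\, m\, (p_x^2, q_x) = -p_x = -X .
\end{aligned}
```

The first and third results are homogeneous linear combinations of the generators, while the second is a pure constant, the inhomogeneous term carried by the mass.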
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II. The Hamiltonians for Special Relativity 

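In order to admit the inhomogeneous Lorentz group rather than the
Newtonian group, some Poisson brackets that vanished for the Newtonian group
must no longer do so; we must have

$$\begin{aligned}
(U, X) &= \frac{1}{c^2}\mathcal{H}, & (V, W) &= -\frac{1}{c^2} L, \\
(V, Y) &= \frac{1}{c^2}\mathcal{H}, & (W, U) &= -\frac{1}{c^2} M, \\
(W, Z) &= \frac{1}{c^2}\mathcal{H}, & (U, V) &= -\frac{1}{c^2} N,
\end{aligned}$$

where c is the speed of light.

These can all be satisfied if we generalize U, V, W, and $\mathcal{H}$, and, with the
other Poisson brackets, give partial differential equations to determine these.
We find that we can take the usual form for $\mathcal{H}$ as for a point particle

$$\mathcal{H} = \left(m^2 c^4 + (p_x^2 + p_y^2 + p_z^2)\,c^2\right)^{1/2}$$

and then U, V, and W, which must form a vector, are found to be

$$\begin{aligned}
U &= \frac{1}{c^2}\, q_x \mathcal{H} - \frac{\omega_y p_z - \omega_z p_y}{mc^2 + \mathcal{H}}, \\
V &= \frac{1}{c^2}\, q_y \mathcal{H} - \frac{\omega_z p_x - \omega_x p_z}{mc^2 + \mathcal{H}}, \\
W &= \frac{1}{c^2}\, q_z \mathcal{H} - \frac{\omega_x p_y - \omega_y p_x}{mc^2 + \mathcal{H}}.
\end{aligned}$$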
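
The bracket relations of the Lorentz group can be spot-checked numerically. The sketch below assumes the generators $U_i = q_i \mathcal{H}/c^2 - (\omega \times p)_i/(mc^2 + \mathcal{H})$ and the bracket structure $(q_i, p_i) = 1$, $(\omega_x, \omega_y) = \omega_z$ (cyclic), and evaluates Poisson brackets by central finite differences at one arbitrary phase-space point. The values of c and m, the sample point, and all function names are illustrative assumptions.

```python
import math

C = 2.0  # speed of light in arbitrary units (illustrative assumption)
M = 1.0  # rest mass (illustrative assumption)

# A state is the list [qx, qy, qz, px, py, pz, wx, wy, wz].

def energy(s):
    px, py, pz = s[3:6]
    return math.sqrt(M * M * C ** 4 + (px * px + py * py + pz * pz) * C * C)

def momentum(i):
    return lambda s: s[3 + i]

def rotation(i):
    """Generators L, M, N for i = 0, 1, 2: orbital part plus omega_i."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return lambda s: s[j] * s[3 + k] - s[k] * s[3 + j] + s[6 + i]

def boost(i):
    """Generators U, V, W for i = 0, 1, 2."""
    j, k = (i + 1) % 3, (i + 2) % 3
    def generator(s):
        h = energy(s)
        cross = s[6 + j] * s[3 + k] - s[6 + k] * s[3 + j]  # (omega x p)_i
        return s[i] * h / C ** 2 - cross / (M * C * C + h)
    return generator

def gradient(f, s, step=1e-5):
    grad = []
    for n in range(len(s)):
        up, down = list(s), list(s)
        up[n] += step
        down[n] -= step
        grad.append((f(up) - f(down)) / (2.0 * step))
    return grad

def poisson(f, g, s):
    """Canonical bracket in (q, p) plus the angular-momentum bracket in omega."""
    df, dg = gradient(f, s), gradient(g, s)
    canonical = sum(df[n] * dg[n + 3] - df[n + 3] * dg[n] for n in range(3))
    fw, gw, w = df[6:9], dg[6:9], s[6:9]
    spin = (w[0] * (fw[1] * gw[2] - fw[2] * gw[1])
            + w[1] * (fw[2] * gw[0] - fw[0] * gw[2])
            + w[2] * (fw[0] * gw[1] - fw[1] * gw[0]))
    return canonical + spin

point = [0.3, -0.7, 1.1, 0.4, 0.9, -0.2, 0.5, -0.3, 0.8]
print(poisson(boost(0), boost(1), point), -rotation(2)(point) / C ** 2)
print(poisson(boost(0), momentum(0), point), energy(point) / C ** 2)
```

Each printed pair should agree to within the finite-difference error, which illustrates $(U, V) = -N/c^2$ and $(U, X) = \mathcal{H}/c^2$ at the chosen point. A single point is of course a consistency check and not a proof.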


III. The Hamiltonians for the Dirac Electron 

The usual Hamiltonian for the Dirac electron may be written

$$\mathcal{H} = -\rho_3 mc^2 - c\rho_1(\sigma_x p_x + \sigma_y p_y + \sigma_z p_z),$$

where $\rho_1$, $\rho_2$, $\rho_3$ and $\sigma_x$, $\sigma_y$, $\sigma_z$ are two independent sets of Pauli matrices
satisfying $\rho_1\rho_2 = -\rho_2\rho_1 = i\rho_3$, $\rho_2\rho_3 = -\rho_3\rho_2 = i\rho_1$, $\rho_3\rho_1 = -\rho_1\rho_3 = i\rho_2$,
$\rho_1^2 = 1$, $\rho_2^2 = 1$, $\rho_3^2 = 1$, and the like, and $p_x$, $p_y$, $p_z$ and $q_x$, $q_y$, $q_z$ are operators
such that $p_x q_x - q_x p_x = -i\hbar$, $p_y q_y - q_y p_y = -i\hbar$, $p_z q_z - q_z p_z = -i\hbar$, the
remaining commutators vanishing.

If we now take

$$\begin{aligned}
X &= p_x, & L &= q_y p_z - q_z p_y + \tfrac{1}{2}\hbar\sigma_x, & U &= (q_x\mathcal{H} + \mathcal{H}q_x)/2c^2, \\
Y &= p_y, & M &= q_z p_x - q_x p_z + \tfrac{1}{2}\hbar\sigma_y, & V &= (q_y\mathcal{H} + \mathcal{H}q_y)/2c^2, \\
Z &= p_z, & N &= q_x p_y - q_y p_x + \tfrac{1}{2}\hbar\sigma_z, & W &= (q_z\mathcal{H} + \mathcal{H}q_z)/2c^2,
\end{aligned}$$

the commutators of these functions of operators give us back the structure
of the inhomogeneous Lorentz group and show that the Dirac electron admits
this group.


IV. The Foldy-Wouthuysen Transformation 


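We can make a unitary transformation of the dynamical variables which
does not change $p_x$, $p_y$, and $p_z$, but reduces $\mathcal{H}$ to depending only on these
and the new $\rho_3'$, namely,

$$f' = \frac{mc^2 + E - ic\rho_2(\sigma_x p_x + \sigma_y p_y + \sigma_z p_z)}{\{2E(mc^2 + E)\}^{1/2}}
\; f \;
\frac{mc^2 + E + ic\rho_2(\sigma_x p_x + \sigma_y p_y + \sigma_z p_z)}{\{2E(mc^2 + E)\}^{1/2}},$$

where $E = +\{m^2c^4 + (p_x^2 + p_y^2 + p_z^2)c^2\}^{1/2}$.
This gives

$$p_x' = p_x \quad \text{etc.},$$

$$\rho_1' = \rho_1 \frac{mc^2}{E} - c\rho_3 \frac{\sigma_x p_x + \sigma_y p_y + \sigma_z p_z}{E},$$

$$\rho_2' = \rho_2,$$

$$\rho_3' = \rho_3 \frac{mc^2}{E} + c\rho_1 \frac{\sigma_x p_x + \sigma_y p_y + \sigma_z p_z}{E},$$

$$\sigma_x' = \sigma_x \frac{mc^2}{E} + \frac{c^2 p_x(\sigma_x p_x + \sigma_y p_y + \sigma_z p_z)}{E(mc^2 + E)} - \frac{c\rho_2(p_y\sigma_z - p_z\sigma_y)}{E} \quad \text{etc.},$$

$$q_x' = q_x + \hbar\frac{c^2(p_y\sigma_z - p_z\sigma_y)}{2E(mc^2 + E)} - \hbar c\rho_2\frac{\sigma_x}{2E} + \hbar c\rho_2\frac{c^2 p_x(\sigma_x p_x + \sigma_y p_y + \sigma_z p_z)}{2E^2(mc^2 + E)} \quad \text{etc.}$$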

Thus 

while 


XP = -E'p' 3 
X = p’ x etc., 

L = q'yP z - q' 2 p'y + %ho' x etc., 


r - -f> (g + - i h 


etc., 


and 
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where E' = E, and the commutators of the new variables must have the same 
values as those of the old. 
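
The reader who wishes to check the transformation numerically may find the following sketch useful. It is not part of the original argument: it assumes a particular 4 x 4 representation (the $\rho$ matrices acting on the first factor of a Kronecker product, the $\sigma$ matrices on the second), treats the momentum components as ordinary numbers, and uses helper names (`kron`, `mul`, `lin`, `dagger`, `fw_check`) that are purely illustrative. It builds $\mathscr{H} = -mc^2\rho_3 - c\rho_1(\sigma\cdot p)$, forms $\rho'_3$ with the unitary factor quoted above, and returns the largest element of $\mathscr{H} + E\rho'_3$, which should vanish to rounding error.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def mul(a, b):
    """Matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(*terms):
    """Linear combination lin((c1, M1), (c2, M2), ...) of equal-sized matrices."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

I2 = [[1, 0], [0, 1]]
PX = [[0, 1], [1, 0]]
PY = [[0, -1j], [1j, 0]]
PZ = [[1, 0], [0, -1]]

# Assumed representation: rho on the first factor, sigma on the second,
# so that every rho commutes with every sigma.
rho1, rho2, rho3 = (kron(P, I2) for P in (PX, PY, PZ))
sx, sy, sz = (kron(I2, P) for P in (PX, PY, PZ))
ONE = kron(I2, I2)

def fw_check(m=1.0, c=1.0, p=(0.3, -0.4, 1.2)):
    """Largest |element| of H + E*rho3', which should be zero."""
    s_dot_p = lin((p[0], sx), (p[1], sy), (p[2], sz))
    H = lin((-m * c * c, rho3), (-c, mul(rho1, s_dot_p)))
    E = math.sqrt(m * m * c ** 4 + c * c * sum(x * x for x in p))
    norm = math.sqrt(2 * E * (m * c * c + E))
    # U = {mc^2 + E - i c rho2 (sigma.p)} / {2E(mc^2 + E)}^(1/2)
    U = lin(((m * c * c + E) / norm, ONE), (-1j * c / norm, mul(rho2, s_dot_p)))
    rho3_new = mul(mul(U, rho3), dagger(U))
    return max(abs(H[i][j] + E * rho3_new[i][j])
               for i in range(4) for j in range(4))

print(fw_check())
```

The same helpers can be used to confirm the quoted expressions for $\rho'_1$ and $\sigma'_x$ by forming `mul(mul(U, rho1), dagger(U))` and so on.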

V. Comparison of the Classical and Quantum-Mechanical Equations 

If we use Dirac’s rule for the correspondence of a classical Poisson bracket 
to a quantum-mechanical commutator 

$$(A, B) = \{AB - BA\}/i\hbar$$

and write

$$\omega_x = \tfrac12\hbar\sigma'_x, \quad \omega_y = \tfrac12\hbar\sigma'_y, \quad \omega_z = \tfrac12\hbar\sigma'_z,$$

we find that the transformed quantum dynamical variables have the same
Poisson brackets as the classical, while the Hamiltonian functions have the
same form, except for the factor $-\rho'_3$ in $\mathscr{H}$, $U$, $V$, and $W$.

Since $\rho'_1$ and $\rho'_2$ do not occur explicitly, we have two unconnected sets of
equations for the two characteristic values $-1$ and $+1$ of $\rho'_3$, the first going
in the classical limit to the classical equations, the second to the classical
equations with the sign of $m$ changed.

Conversely if we take two sets of classical equations for the opposite signs 
of m and quantise each by Dirac's rule, the Foldy-Wouthuysen transformation 
will transform the result into the equations for the free Dirac electron. 
The Hartree and Hartree-Fock (HF) wave functions for an atom or a
molecule are usually derived by a variational procedure. This shows that the 
associated energy is stationary with respect to the permitted variation of the 
individual orbitals. But it does not show whether, or to what extent, this 
energy is an absolute minimum. A rather general discussion of this topic, 
under the heading “Stability of Hartree-Fock States” has been written by 
Adams (1962). It may be interesting, however, to discuss one particular case 
in a little more detail, since this can bring out points not so easily noticed in 
the more general discussion. The purpose of this note is to provide such an 
account for the ground state ${}^1S$ of atomic helium.

In general, as Hylleraas pointed out many years ago, the best orbital
description of this state is the open-shell function

$$\psi = u(1)v(2) + v(1)u(2) \tag{1}$$

in which $u$ and $v$ are the $1s$ and $1s'$ atomic orbitals, and we have omitted the
spin factor as irrelevant to our discussion. However, the traditional Hartree
function is the closed-shell expression

$$\psi = \phi(1)\phi(2) \tag{2}$$

Since (2) is a particular case of (1), in which $u = v = \phi$, it follows that the HF
open-shell energy of (1) must be lower than the Hartree energy of (2). But
this, by itself, does not decide whether in (2) we are dealing with a local
minimum of the energy, or with a saddle-point. It is true that simple
calculations using approximate expressions for $u$, $v$, and $\phi$ lead us to expect a
saddle-point. The Hylleraas-Eckart wave functions (1930, 1932) put, in atomic units,

$$u = e^{-\alpha r}, \quad v = e^{-\beta r}, \tag{3}$$

and the Kellner (1927) wave function puts

$$\phi = e^{-\gamma r}. \tag{4}$$
In (3) we obtain the energy $E$ as a function $E(\alpha, \beta)$ of the variable parameters
$\alpha$, $\beta$; and in (4) we obtain $E$ in the form $E(\gamma)$. Since, if $\alpha = \beta = \gamma$, (3) and
(1) reduce to (4) and (2), it follows that

$$E(\gamma) = E(\gamma, \gamma).$$

It has been shown by several authors (see, e.g., Silverman et al., 1960; Hurst
et al., 1958; Shull and Lowdin, 1956; Scherr and Silverman, 1960) that the
absolute minimum of $E(\alpha, \beta)$ in the $\alpha\beta$-plane occurs at $\alpha = 1.1875$, $\beta = 2.1832$,
and at the equivalent point with $\alpha$ and $\beta$ interchanged. Contours of constant
$E(\alpha, \beta)$ are symmetrical with respect to the line $\alpha = \beta$, and along this line the
minimum occurs at the Hartree-type function $\alpha = \beta\ (= \gamma) = 1.6875$. The
general shape of the $E(\alpha, \beta)$-contours in Fig. 1 shows that the Hartree-like
energy is indeed a saddle-point. A considerable variety of changes may be
made in $\alpha$ and $\beta$, leading to a lowering of the energy, if we relax the condition
$u = v$.
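
The saddle-point character of $E(\alpha, \beta)$ is easily exhibited numerically. The sketch below is an illustration added here, not the calculation used for Fig. 1: it assumes the standard closed-form integrals over unnormalized exponentials $e^{-ar}$ in atomic units, and the function name `eckart_energy` and its helpers are hypothetical. It evaluates the energy of the function (1) with the orbitals (3).

```python
import math

def eckart_energy(alpha, beta, Z=2.0):
    """Energy (hartree) of u(1)v(2) + v(1)u(2), u = exp(-alpha r), v = exp(-beta r)."""
    pi = math.pi
    s_uu = pi / alpha ** 3                  # <u|u>
    s_vv = pi / beta ** 3                   # <v|v>
    s_uv = 8 * pi / (alpha + beta) ** 3     # <u|v>

    def h(a, b):
        # <exp(-ar)| -(1/2)del^2 - Z/r |exp(-br)>
        return 4 * pi * a * b / (a + b) ** 3 - 4 * pi * Z / (a + b) ** 2

    def repulsion(a, b):
        # double integral of exp(-a r1) exp(-b r2) / r12
        return (32 * pi ** 2 * (a * a + 3 * a * b + b * b)
                / (a * a * b * b * (a + b) ** 3))

    J = repulsion(2 * alpha, 2 * beta)            # <uv|1/r12|uv>
    K = repulsion(alpha + beta, alpha + beta)     # <uv|1/r12|vu>
    numerator = (2 * (h(alpha, alpha) * s_vv + h(beta, beta) * s_uu)
                 + 4 * h(alpha, beta) * s_uv + 2 * J + 2 * K)
    denominator = 2 * s_uu * s_vv + 2 * s_uv ** 2
    return numerator / denominator

g = 27 / 16                                   # the Hartree-type exponent 1.6875
print(eckart_energy(g, g))                    # equals -(27/16)**2
print(eckart_energy(g + 0.1, g + 0.1))        # alpha and beta moved together
print(eckart_energy(g + 0.1, g - 0.1))        # alpha and beta moved oppositely
print(eckart_energy(1.19, 2.18))              # near the open-shell minimum
```

Moving $\alpha$ and $\beta$ together from 1.6875 raises the energy, while moving them in opposite senses lowers it, which is the behaviour of a saddle-point; near $(1.19, 2.18)$ the energy is about $-2.876$.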

Fig. 1. Energy contours for the $(1s)(1s')\ {}^1S$ wave function of helium, using Slater orbitals
with exponents $\alpha$ and $\beta$. (The author would like to thank Dr. M. D. Poole for his help in
making the calculations embodied in this diagram.)

This situation refers to the approximate wave functions of types (3) and
(4). But it can be shown that much the same situation holds for the full
solutions of the Hartree and HF open-shell equations. Let us write for the
total Hamiltonian

$$\mathbf{H} = H(1) + H(2) + 1/r_{12} \tag{5}$$

where

$$H(i) = -\tfrac12\nabla_i^2 - 2/r_i \tag{6}$$

and let us suppose that in the wave function $\psi = u(1)v(2) + v(1)u(2)$ the
atomic orbitals $u$ and $v$ are normalized (and real). Then if $E_\psi$ is the energy
associated with $\psi$,

$$E_\psi = \frac{\langle\psi|\mathbf{H}|\psi\rangle}{\langle\psi|\psi\rangle} \tag{7}$$

where

$$\langle\psi|\psi\rangle = 2 + 2\langle u|v\rangle^2 \tag{8}$$

$$\langle\psi|\mathbf{H}|\psi\rangle = 2\langle u|H|u\rangle + 2\langle v|H|v\rangle + 4\langle u|v\rangle\langle u|H|v\rangle + 2\Big\langle uv\Big|\frac{1}{r_{12}}\Big|uv\Big\rangle + 2\Big\langle uv\Big|\frac{1}{r_{12}}\Big|vu\Big\rangle \tag{9}$$

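$E_\psi$ may now be regarded as a functional of the atomic orbitals $u$ and $v$ in
Hilbert space. Let us start from some assumed $u$, $v$ and make changes

$$u \to u + \Delta u, \quad v \to v + \Delta v \tag{10}$$

where, to preserve normalization, $\langle u|\Delta u\rangle = 0 = \langle v|\Delta v\rangle$ to first order in
$\Delta u$, $\Delta v$. Then

$$\Delta E_\psi = \frac{\langle\psi + \Delta\psi|\mathbf{H}|\psi + \Delta\psi\rangle}{\langle\psi + \Delta\psi|\psi + \Delta\psi\rangle} - \frac{\langle\psi|\mathbf{H}|\psi\rangle}{\langle\psi|\psi\rangle}.$$

It follows that $\Delta E_\psi$ has the same sign as

$$\langle\psi|\psi\rangle\{2\langle\Delta\psi|\mathbf{H}|\psi\rangle + \langle\Delta\psi|\mathbf{H}|\Delta\psi\rangle\} - \langle\psi|\mathbf{H}|\psi\rangle\{2\langle\Delta\psi|\psi\rangle + \langle\Delta\psi|\Delta\psi\rangle\}. \tag{11}$$

As we shall see, we may not discard terms of the second degree in $\Delta\psi$. Using
(11) and (7) it follows that $\Delta E_\psi$ has the same sign as

$$2\langle\Delta\psi|\mathbf{H} - E_\psi|\psi\rangle + \langle\Delta\psi|\mathbf{H} - E_\psi|\Delta\psi\rangle. \tag{12}$$

If we start from the equilibrium $(1s)(1s')$ situation, the first term in (12)
will be zero; and the second term is then necessarily positive. But we do not
wish to start there. Rather let us start from the equilibrium Hartree solution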
$u = v = \phi$, and consider in a little more detail the two separate terms in (12).
We are now putting (to first order)

$$\psi = 2u(1)u(2)$$

$$\Delta\psi = u(1)\Delta v(2) + v(1)\Delta u(2) + \Delta v(1)u(2) + \Delta u(1)v(2) \tag{13}$$

$$u = v = \phi.$$

It soon follows that in (12) the terms of the first degree in $\Delta u$, $\Delta v$ give

$$8\langle\phi(1)\Delta v(2)|\mathbf{H} - E_\psi|\phi(1)\phi(2)\rangle + 8\langle\phi(1)\Delta u(2)|\mathbf{H} - E_\psi|\phi(1)\phi(2)\rangle. \tag{14}$$

Now $\phi$ is a function which minimizes the expression

$$\frac{\langle\phi\phi|\mathbf{H}|\phi\phi\rangle}{\langle\phi\phi|\phi\phi\rangle}$$

with respect to all changes $\Delta\phi$ that preserve the normalization condition

$$\langle\phi|\Delta\phi\rangle = 0.$$

This implies that

$$\langle\phi\,\Delta\phi|\mathbf{H}|\phi\phi\rangle = 0. \tag{15}$$

In our case, as (10) and (13) show, we may use $\Delta u$ or $\Delta v$ as particular cases
of $\Delta\phi$, since we are assuming throughout that $u$ and $v$ are separately
normalized. Thus all terms of the first degree in $\Delta u$, $\Delta v$ in (12) vanish for all
admissible $\Delta u$, $\Delta v$.

This leads us to terms of the second degree in (12). Straightforward
substitution shows that these may be written in the form

$$2\langle\phi, (\Delta u + \Delta v)|\mathbf{H} - E_\psi|\phi, (\Delta u + \Delta v)\rangle + 2\langle\phi, (\Delta u + \Delta v)|\mathbf{H} - E_\psi|(\Delta u + \Delta v), \phi\rangle + 8\langle\Delta u\,\Delta v|\mathbf{H} - E_\psi|\phi\phi\rangle \tag{16}$$

This expression does not vanish in general. Furthermore, it may be shown to
have opposite signs for at least two distinct choices of $\Delta u$, $\Delta v$. In the first
place, if we restrict ourselves to $\Delta u = \Delta v$, then we are dealing with the Hartree
closed-shell, and not the HF open-shell problem; and we know that, for this,
$\Delta E_\psi$ must be positive. In the second place, if we consider choices such that
$\Delta u = -\Delta v$, the first two terms of (16) vanish identically and (see below) the
third term is negative. This implies that, if we start from the Hartree solution,
there are certain changes in the atomic orbitals $u$ and $v$ which increase the
energy, and others that decrease it. The Hartree energy must therefore be a
saddle-point.
We could put this pictorially in terms of Hilbert space as follows. The
atomic orbitals u and v could each be represented by some point in Hilbert 
space. On account of the normalization condition these points lie on the 
surface of a sphere of unit radius. In the Hartree solution both orbitals u and 
v are represented (Fig. 2) by the same point P. Allowed small variations that 
preserve the normalization of u and v correspond to small displacements of the 
representative point in some arbitrary directions lying in the tangent plane to 
this sphere. Our conclusion is that if both representative points move together 
from P, the energy will increase, whatever this direction may be, provided 
of course that it lies in the tangent plane. But if the two representative points 
move in diametrically opposite directions, the energy will decrease, whatever 
these directions may be. 

The two changes in u and v just described may be regarded as the extreme 
possibilities. Around each of them there must be a range of directions in the 
tangent plane of Fig. 2 for which the change in energy is of the same sign as 
for the extreme directions. There is a very close correspondence between this 
very general situation and the more restricted one represented in Fig. 1. For 
now points $P$ in Hilbert space become points $(\alpha, \beta)$ in the plane of Fig. 1. But
if we start from the Hartree point $\alpha = \beta\ (= \gamma)$, then any changes for which
$\Delta\alpha = \Delta\beta$ lead to an increase in energy; and any changes for which $\Delta\alpha = -\Delta\beta$
lead to a decrease; and both extremes are surrounded by a range of other
$(\Delta\alpha, \Delta\beta)$ with positive, or with negative, $\Delta E$.

Fig. 2. Hilbert-space representation of permitted changes in the atomic orbitals $u$
and $v$. $P$ denotes the Hartree orbital $\phi$, and $PQ$ denotes $\Delta u$, $PR$ denotes $\Delta v$. $\Delta u$ and $\Delta v$
lie in the tangent plane at $P$ to the unit sphere.

To complete the proof we have still to show that if $\Delta u = -\Delta v$, the last
term in Eq. (16) is negative. We have to prove that

$$\langle\Delta u\,\Delta u|\mathbf{H} - E_\psi|\phi\phi\rangle \tag{17}$$

is positive. On account of the orthogonality of $\Delta u$ and $\phi$, this means, from
Eqs. (5) and (6), that

$$\Big\langle\Delta u\,\Delta u\Big|\frac{1}{r_{12}}\Big|\phi\phi\Big\rangle = \iint\frac{\Delta u(1)\phi(1)\,\Delta u(2)\phi(2)}{r_{12}}\,d\tau_1\,d\tau_2$$

is to be positive. Put $\rho(1) = \Delta u(1)\phi(1)$, and put $V$ for the solution (tending to
zero at infinity) of the Poisson equation $\nabla^2 V = -4\pi\rho$, so that

$$V(1) = \int\frac{\rho(2)}{r_{12}}\,d\tau_2. \tag{18}$$

The integral may now be written in the form

$$\int\rho V\,d\tau = -\frac{1}{4\pi}\int V\nabla^2 V\,d\tau. \tag{19}$$

There are no singularities in $V$, so that by Green's theorem this becomes

$$\frac{1}{4\pi}\int(\nabla V)^2\,d\tau - \frac{1}{4\pi}\oint V\,\nabla V\cdot d\mathbf{S}, \tag{20}$$

where the surface integral, which is taken over the sphere at infinity, must
vanish from a consideration of orders of magnitude. The new volume integral
is necessarily positive, and our theorem is proved.

A few brief comments may be added: 

(1) If we start from the Hartree solution $u = v = \phi$, and then consider the
small changes $\Delta u = -\Delta v\ (= \Delta\phi$ say), it soon follows that

$$\psi(1, 2) = \{\phi(1)\phi(2) - \Delta\phi(1)\Delta\phi(2)\} \times \text{normalizing factor}.$$

Thus, this particular first-order variation in $u$ and $v$ leads to a second-order
change in $\psi$. We could, in fact, have built up our argument from this situation,
in a similar fashion to that used by Adams (1962).

(2) In the same way the one-particle density matrix becomes

$$\rho(1; 1') = \{\phi(1)\phi(1') + \Delta\phi(1)\Delta\phi(1')S\}/\{1 + S^2\} \tag{21}$$

where

$$S = \int\{\Delta\phi(1)\}^2\,d\tau_1.$$

(3) Similarly the two-particle density matrix has a diagonal term

$$\rho(1, 2; 1, 2) = \frac{\phi^2(1)\phi^2(2) - 2\phi(1)\Delta\phi(1)\phi(2)\Delta\phi(2) + \{\Delta\phi(1)\}^2\{\Delta\phi(2)\}^2}{1 + S^2} \tag{22}$$
Thus both the density matrices are changed from their Hartree values by
terms of the second order in $\Delta\phi$. But we no longer have the relation, typical
of a single-determinant wave function:

$$\rho(1, 2; 1, 2) = \rho(1; 1)\rho(2; 2) - \rho(1; 2)^2. \tag{23}$$

(4) The argument given in this note is easily extended to other situations 
where a closed subshell is split open, as in passing from wave functions of 
type (2) to those of type (1). 

Note added in proof: A more careful consideration of orders of magnitude 
shows that, although the conclusion of this paper is correct, the proof needs a 
little tightening up, insofar as the second-order terms are concerned. Thus, if 
we want the new orbital $u + \Delta u$ in (10) to be normalized, we may only put
$\langle u|\Delta u\rangle = 0$ to first-order. This is clear from Fig. 2 in which the points $Q$ and
$R$ lie on the unit sphere in Hilbert space, and therefore are only on the tangent
plane at $P$ to first-order. The result of this change is to replace $\Delta u$ in the
fundamental formula (17) by $\Delta u - k\phi$, where $k$ is the overlap integral
$\int\Delta u\cdot\phi\,d\tau$. This new formula is correct to second-order, and leads to the
stated result.
I. Introduction

One of the most useful ideas in the quantum theory of atoms and molecules 
is that of the Slater determinant, which represents a many-electron wave 
function as a determinant of electronic spinorbitals. The exclusion principle 
is met by the antisymmetry of a determinant under an interchange of two 
rows; for a closed shell the singlet character is ensured by the rotational 
invariance of the form $(\alpha_1\beta_2 - \beta_1\alpha_2)$, where $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ are a pair of
spinors.

It is the purpose of this article to advertise to molecular theorists the use 
of a closely related idea, that of second quantization. The methods of second 
quantization are universally applied nowadays to many-body problems, such 
as arise in the theory of metals or of nuclear matter; but so far they have 
found relatively little favor among theoretical chemists. There have been good 
reasons in the past for adopting elementary, rather than highbrow, methods 
for solving chemical problems; it is silly to use a steam hammer for cracking a 
nut. But theoretical chemistry is becoming more and more concerned with 
systems, such as large conjugated molecules or crystalline molecular complexes, 
in which the number of particles is infinite or indeterminate; and for such 
systems explicit wave functions are awkward to specify, and elementary 
methods become clumsy in the extreme. It is then that the methods of second
quantization are especially useful; but the technique can also be applied with 
advantage to finite systems, as I shall try to show in the following paragraphs. 

This article is shamelessly unoriginal; the main ideas in it were put forward 
and developed by Heisenberg, Jordan, and Dirac, in the early days of the 
quantum theory. But good ideas, like good wine, improve with age; and it is 
hoped that in his capacity as a distinguished teacher Professor John C. Slater 
will view with indulgence this attempt to guide the less erudite reader through 
the difficult initial stages of understanding creation and annihilation operators, 
and how to use them. 


II. Creation and Annihilation Operators 

Suppose that we are given a complete orthonormal set of spinorbitals for 
an electron, denoted by 


$$\phi_1, \phi_2, \ldots, \phi_\nu, \ldots$$

Then if $N$ electrons are present in the system, any pure state of the electrons
can be expressed as a linear combination of Slater determinants, each having
the form

$$(N!)^{-1/2}\begin{vmatrix}\phi_1(1) & \phi_2(1) & \cdots & \phi_N(1)\\ \phi_1(2) & \phi_2(2) & \cdots & \phi_N(2)\\ \vdots & \vdots & & \vdots\\ \phi_1(N) & \phi_2(N) & \cdots & \phi_N(N)\end{vmatrix} = |\phi_1\phi_2\cdots\phi_N\rangle.$$

For each spinorbital $\phi_\nu$ we proceed to define the creation operator $\phi_\nu^+$ by the
equation

$$\phi_\nu^+|\phi_1\phi_2\cdots\rangle = |\phi_\nu\phi_1\phi_2\cdots\rangle.$$

In words, the creation operator $\phi_\nu^+$, applied to the $N \times N$ Slater determinant
$|\phi_1\cdots\phi_N\rangle$, converts it into the (normalized) Slater determinant $|\phi_\nu\phi_1\cdots\phi_N\rangle$
which has $N + 1$ rows and columns; except that if $\phi_\nu$ occurs in the original
determinant, the result is zero. Thus, for example,

$$\phi_1^+|\phi_2\rangle = |\phi_1\phi_2\rangle,$$

$$\phi_1^+|\phi_1\rangle = 0,$$

$$\phi_2^+|\phi_1\rangle = |\phi_2\phi_1\rangle = -|\phi_1\phi_2\rangle.$$

It is convenient to introduce the concept of a “vacuum state”, denoted by
the symbol $|\ \rangle$. Then any Slater determinant may be generated by applying
a succession of creation operators to the vacuum state, as follows:

$$\phi_1^+\phi_2^+\phi_3^+|\ \rangle = \phi_1^+\phi_2^+|\phi_3\rangle = \phi_1^+|\phi_2\phi_3\rangle = |\phi_1\phi_2\phi_3\rangle.$$
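
The rule for a creation operator is simple enough to be put on a computer in a few lines. The following sketch is offered only as an illustration; the representation is an assumption made here (a determinant is a pair `(sign, orbitals)` with the spinorbital labels kept in ascending order, and `None` stands for zero), and the names `canonical`, `create`, and `VACUUM` are hypothetical.

```python
def canonical(sign, orbitals):
    """Sort the orbital labels, changing the sign once for every transposition."""
    orbs = list(orbitals)
    for i in range(len(orbs)):
        for j in range(len(orbs) - 1 - i):
            if orbs[j] > orbs[j + 1]:
                orbs[j], orbs[j + 1] = orbs[j + 1], orbs[j]
                sign = -sign
    return sign, tuple(orbs)

def create(nu, state):
    """Apply the creation operator for spinorbital nu; None represents zero."""
    if state is None:
        return None
    sign, orbs = state
    if nu in orbs:
        return None                 # the spinorbital is already occupied
    return canonical(sign, (nu,) + orbs)

VACUUM = (1, ())

print(create(1, (1, (2,))))                        # |phi_1 phi_2>
print(create(1, (1, (1,))))                        # zero
print(create(2, (1, (1,))))                        # -|phi_1 phi_2>
print(create(1, create(2, create(3, VACUUM))))     # |phi_1 phi_2 phi_3>
```

The four calls reproduce, in order, the three worked examples and the vacuum-state construction given above.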
The anticommutator of $\phi_\mu^+$ and $\phi_\nu^+$, defined as

$$\{\phi_\mu^+, \phi_\nu^+\} = \phi_\mu^+\phi_\nu^+ + \phi_\nu^+\phi_\mu^+,$$

is readily seen to be zero. For if it is applied to a Slater determinant in which
either $\phi_\mu$ or $\phi_\nu$ appears, each term of the anticommutator produces zero;
and if the Slater determinant includes neither $\phi_\mu$ nor $\phi_\nu$, we obtain

$$\{\phi_\mu^+, \phi_\nu^+\}|\phi_1\phi_2\cdots\rangle = |\phi_\mu\phi_\nu\phi_1\phi_2\cdots\rangle + |\phi_\nu\phi_\mu\phi_1\phi_2\cdots\rangle = 0.$$

The annihilation operator $\phi_\nu^-$ is defined by the equation

$$\phi_\nu^-|\phi_\nu\phi_1\phi_2\cdots\rangle = |\phi_1\phi_2\cdots\rangle$$

with the supplementary statement that if $\phi_\nu$ does not occur in the Slater
determinant, then the application of $\phi_\nu^-$ to the determinant gives zero. Thus,
for example,

$$\phi_1^-|\phi_1\phi_2\rangle = |\phi_2\rangle$$

$$\phi_2^-|\phi_1\phi_2\rangle = -\phi_2^-|\phi_2\phi_1\rangle = -|\phi_1\rangle$$

$$\phi_1^-|\phi_2\rangle = 0$$

$$\phi_1^-|\phi_1\rangle = |\ \rangle.$$

The anticommutator of $\phi_\mu^-$ and $\phi_\nu^-$ also vanishes identically:

$$\{\phi_\mu^-, \phi_\nu^-\} = \phi_\mu^-\phi_\nu^- + \phi_\nu^-\phi_\mu^- = 0.$$

It may be verified without difficulty that if $\phi_\mu^+$ is a creation operator, and $\phi_\nu^-$
is an annihilation operator referring to a different spinorbital, then the
anticommutator of $\phi_\mu^+$ and $\phi_\nu^-$ vanishes. For instance,

$$\{\phi_1^+, \phi_2^-\}|\phi_1\phi_2\rangle = -\phi_1^+|\phi_1\rangle + \phi_2^-\cdot 0 = 0.$$

But if a creation and an annihilation operator refer to the same spinorbital,
then their anticommutator equals unity. For example

$$\{\phi_1^+, \phi_1^-\}|\phi_1\phi_2\rangle = \phi_1^+|\phi_2\rangle + \phi_1^-\cdot 0 = |\phi_1\phi_2\rangle.$$

In general, then,

$$\{\phi_\mu^+, \phi_\nu^-\} = \phi_\mu^+\phi_\nu^- + \phi_\nu^-\phi_\mu^+ = \delta_{\mu\nu}.$$

A Slater determinant in which $\phi_\nu$ appears is an eigenfunction of $\phi_\nu^+\phi_\nu^-$
with eigenvalue 1. For example

$$\phi_1^+\phi_1^-|\phi_1\phi_2\rangle = \phi_1^+|\phi_2\rangle = |\phi_1\phi_2\rangle.$$

Similarly, a Slater determinant in which $\phi_\nu$ does not appear is an eigenfunction
of $\phi_\nu^+\phi_\nu^-$ with eigenvalue 0. Accordingly we designate $\phi_\nu^+\phi_\nu^-$ as the population
of $\phi_\nu$, represented by the symbol $\hat n_\nu$.
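
The anticommutation rules can be checked mechanically. In the sketch below, which is an illustration and not part of the original development, a wave function is assumed to be stored as a dictionary mapping each Slater determinant (a tuple of spinorbital labels in ascending order) to its coefficient; the function names are hypothetical.

```python
def create(nu, ket):
    """Creation operator for spinorbital nu acting on {determinant: coefficient}."""
    out = {}
    for orbs, coeff in ket.items():
        if nu in orbs:
            continue                           # already occupied: term gives zero
        position = sum(1 for o in orbs if o < nu)
        new = orbs[:position] + (nu,) + orbs[position:]
        # nu enters at the front and moves past `position` columns
        out[new] = out.get(new, 0) + (-1) ** position * coeff
    return out

def annihilate(nu, ket):
    """Annihilation operator for spinorbital nu."""
    out = {}
    for orbs, coeff in ket.items():
        if nu not in orbs:
            continue                           # absent: term gives zero
        position = orbs.index(nu)
        new = orbs[:position] + orbs[position + 1:]
        # nu is first brought to the front, then removed
        out[new] = out.get(new, 0) + (-1) ** position * coeff
    return out

def add(a, b):
    """Sum of two wave functions, dropping determinants whose coefficient is zero."""
    out = dict(a)
    for key, value in b.items():
        out[key] = out.get(key, 0) + value
    return {key: value for key, value in out.items() if value != 0}

def anticommutator(op1, op2, ket):
    """(op1 op2 + op2 op1) applied to ket."""
    return add(op1(op2(ket)), op2(op1(ket)))

ket12 = {(1, 2): 1}                            # |phi_1 phi_2>
c1 = lambda k: create(1, k)
a1 = lambda k: annihilate(1, k)
a2 = lambda k: annihilate(2, k)

print(annihilate(2, ket12))                    # -|phi_1>
print(anticommutator(c1, a2, ket12))           # zero, stored as an empty dictionary
print(anticommutator(c1, a1, ket12))           # |phi_1 phi_2> again
print(create(1, annihilate(1, ket12)))         # population of phi_1 is 1
```

An empty dictionary represents zero, so the second line of output confirms that a creation and an annihilation operator for different spinorbitals anticommute on this determinant.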

A matrix element of $\phi_\nu^+$, or of $\phi_\nu^-$, between two Slater determinants, may
be interpreted in either of two ways. Thus $\langle\phi_1\phi_2|\phi_1^+|\phi_2\rangle$ may be thought of
as the scalar product of $\langle\phi_1\phi_2|$ and $\phi_1^+|\phi_2\rangle$, or of $\langle\phi_1\phi_2|\phi_1^+$ with $|\phi_2\rangle$. We
obtain a consistent scheme if we define $\langle\phi_1\phi_2\cdots|\phi_\nu^+$ as the complex
conjugate of $\phi_\nu^-|\phi_1\phi_2\cdots\rangle$, and define $\langle\phi_1\phi_2\cdots|\phi_\nu^-$ as the complex conjugate of
$\phi_\nu^+|\phi_1\phi_2\cdots\rangle$. Thus when $\phi_\nu^+$ acts to the left, it removes $\phi_\nu$ from $\langle\phi_\nu\phi_1\phi_2\cdots|$
(or gives zero if $\phi_\nu$ is absent to begin with); and when $\phi_\nu^-$ acts to the left
it adds $\phi_\nu$ to the complex conjugate determinant $\langle\phi_1\phi_2\cdots|$ (or produces
zero if $\phi_\nu$ is there already).

It is usual, in the literature, to write a pair of creation and annihilation
operators as $\phi_\nu^\dagger$ and $\phi_\nu$, to emphasize their complex conjugacy. In this article
we shall adhere to the notation $\phi_\nu^+$ and $\phi_\nu^-$, in order to avoid confusion
between the operators and the spinorbital $\phi_\nu$ to which they are related.

The effect of $\phi_\nu^+$ and $\phi_\nu^-$ on any Slater determinant of the $\phi_\nu$ is thus uniquely
defined. But these Slater determinants form a complete orthonormal set of
wave functions for all possible numbers of electrons. We draw the important
conclusion that the $\phi_\nu^+$ and $\phi_\nu^-$ give a uniquely defined result when applied
to any wave function whatever, and that products of these operators have
uniquely defined matrix elements between all possible pairs of electronic wave
functions.


III. One-Electron Operators 


The particular usefulness of creation and annihilation operators is that 
any physical operator can be expressed in terms of them. Every physical 
operator is symmetric in the variables of all the electrons, and two types of 
operator are especially important in practical problems, namely one-electron 
and two-electron operators. 

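A one-electron operator is an operator of the form

$$\hat F = \sum_i f_i,$$

where the sum is over all the electrons. Such an operator has the property
that its matrix element between any two many-electron wave functions is the
same as the matrix element of $\sum_{\mu\nu}\phi_\mu^+ f_{\mu\nu}\phi_\nu^-$ between the same two wave
functions, where

$$f_{\mu\nu} = \int\phi_\mu^*(i)\,f_i\,\phi_\nu(i)\,d\tau_i,$$

and we may therefore write the operator identity

$$\hat F = \sum_{\mu\nu}\phi_\mu^+ f_{\mu\nu}\phi_\nu^-.$$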
We shall not give the proof of this identity (the reader may care to construct
a proof for himself) but will present some applications of it.

A simple special case is that in which $f_i = 1$, so that $\hat F$ is the operator
representing the total number of electrons. Calling this operator $\hat N$ we obtain

$$\hat N = \sum_{\mu\nu}\phi_\mu^+\delta_{\mu\nu}\phi_\nu^- = \sum_\nu\phi_\nu^+\phi_\nu^- = \sum_\nu\hat n_\nu.$$

Thus any wave function for an $N$-electron system must be an eigenfunction
of this operator with eigenvalue $N$.

The dipole moment of a neutral molecule is represented by the operator

$$\hat M = \sum_i e\mathbf{r}_i.$$

An alternative expression for $\hat M$ is therefore

$$\hat M = \sum_{\mu\nu}\phi_\mu^+\,e\mathbf{r}_{\mu\nu}\,\phi_\nu^-, \qquad \mathbf{r}_{\mu\nu} = \int\phi_\mu^*(\mathbf{r})\,\mathbf{r}\,\phi_\nu(\mathbf{r})\,d\mathbf{r},$$

where $e\mathbf{r}_{\mu\nu}$ is the transition moment associated with the spinorbitals $\phi_\mu$
and $\phi_\nu$.

The potential energy of the electrons in the coulomb field of an atomic
nucleus, and their kinetic energy, may likewise be expressed in terms of the
$\phi_\mu^+$ and $\phi_\nu^-$:

$$\hat V = \sum_{\mu\nu}\phi_\mu^+ v_{\mu\nu}\phi_\nu^-, \qquad v_{\mu\nu} = -Ze^2\int\phi_\mu^*(\mathbf{r})(1/r)\phi_\nu(\mathbf{r})\,d\mathbf{r},$$

$$\hat T = \sum_{\mu\nu}\phi_\mu^+ t_{\mu\nu}\phi_\nu^-, \qquad t_{\mu\nu} = -(\hbar^2/2m)\int\phi_\mu^*(\mathbf{r})\nabla^2\phi_\nu(\mathbf{r})\,d\mathbf{r}.$$

Very often one will be interested in the expectation values of one-electron
operators for a particular state of the system, such as its ground state. Using
the symbol $\langle X\rangle$ to denote the expectation value of $\hat X$ in a particular state, we
deduce that $\langle F\rangle$ must be the following linear combination of the expectation
values of the operators $\phi_\mu^+\phi_\nu^-$:

$$\langle F\rangle = \sum_{\mu\nu}f_{\mu\nu}\langle\phi_\mu^+\phi_\nu^-\rangle = \sum_{\mu\nu}f_{\mu\nu}\gamma_{\nu\mu}.$$

Here the matrix

$$\gamma_{\nu\mu} = \langle\phi_\mu^+\phi_\nu^-\rangle$$

is called the one-particle density matrix, or the one-electron density matrix
in the representation defined by the $\phi_\nu$. The point of defining the expectation
value of $\phi_\mu^+\phi_\nu^-$ as $\gamma_{\nu\mu}$ rather than as $\gamma_{\mu\nu}$ is that one can then express the
expectation value of any one-electron operator $\hat F$ in the form

$$\langle F\rangle = \operatorname{trace}(f\gamma),$$

and this expression is conveniently invariant under an orthonormal
transformation of the spinorbitals $\phi_\nu$.

To sum up this section, we have seen that it is possible to calculate the
matrix elements of any one-electron operator if we can find the matrix
elements of the products $\phi_\mu^+\phi_\nu^-$. Further, the expectation value of any
one-electron operator for any state (pure or mixed) is determined by the
expectation values of the $\phi_\mu^+\phi_\nu^-$, that is, by the elements of the one-particle density
matrix, defined as

$$\gamma_{\mu\nu} = \langle\phi_\nu^+\phi_\mu^-\rangle.$$
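
The invariance of $\operatorname{trace}(f\gamma)$ is readily illustrated for a system with two spinorbitals. The sketch below is an added illustration with hypothetical numbers: the integrals in `f` are invented, the state is assumed to be the single determinant in which spinorbital 0 is occupied, and the orthonormal transformation is taken to be a real rotation through an angle `theta`.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def expectation(f, gamma):
    """<F> = sum over mu, nu of f[mu][nu] * gamma[nu][mu], i.e. trace(f gamma)."""
    product = matmul(f, gamma)
    return sum(product[i][i] for i in range(len(product)))

def to_rotated_basis(m, theta):
    """Express a 2 x 2 matrix in a basis rotated through theta: R^T m R."""
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s], [s, c]]
    return matmul(transpose(r), matmul(m, r))

f = [[-1.0, 0.2], [0.2, -0.5]]       # hypothetical one-electron integrals
gamma = [[1.0, 0.0], [0.0, 0.0]]     # spinorbital 0 occupied, spinorbital 1 empty

print(expectation(f, gamma))
print(expectation(to_rotated_basis(f, 0.7), to_rotated_basis(gamma, 0.7)))
```

Both lines print the same number (to rounding error), although neither `f` nor `gamma` is diagonal in the rotated representation.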

IV. Two-Electron Operators 

A two-electron operator is an operator of the form

$$\hat G = \sum_{i<j}g_{ij},$$

where the sum is over all pairs of electrons. A particularly important
two-electron operator is the coulomb repulsion between the electrons in a
molecule; another such operator is the interaction between their magnetic moments.
In terms of creation and annihilation operators, $\hat G$ may be written

$$\hat G = \tfrac12\sum_{\mu\nu\sigma\tau}\phi_\mu^+\phi_\nu^+\,g_{\mu\nu,\tau\sigma}\,\phi_\sigma^-\phi_\tau^-,$$

where

$$g_{\mu\nu,\tau\sigma} = \iint\phi_\mu^*(i)\phi_\nu^*(j)\,g_{ij}\,\phi_\tau(i)\phi_\sigma(j)\,d\tau_i\,d\tau_j.$$

The expectation value of $\hat G$ for a particular state is thus

$$\tfrac12\sum_{\mu\nu\sigma\tau}g_{\mu\nu,\tau\sigma}\Gamma_{\tau\sigma,\mu\nu},$$

where

$$\Gamma_{\tau\sigma,\mu\nu} = \langle\phi_\mu^+\phi_\nu^+\phi_\sigma^-\phi_\tau^-\rangle$$

is called the two-particle density matrix for the state in question. The reader
should note carefully the order of the subscripts in these expressions. In
$\Gamma_{\tau\sigma,\mu\nu}$ the first two subscripts indicate which row, and the last two which
column, the element refers to. An invariant expression for $\langle G\rangle$ is therefore

$$\langle G\rangle = \operatorname{trace}(g\Gamma).$$

Hence if we could determine the two-electron density matrix we could
calculate the electron repulsion energy of a many-electron system in a given state.
In practice, of course, this is usually impossible; but there is one kind of wave
function for which $\Gamma_{\tau\sigma,\mu\nu}$ can be expressed in terms of $\gamma_{\mu\nu}$, which is a much
easier thing to handle. This is any wave function which takes the form of a
single determinant of spinorbitals, which need not be the $\phi_\nu$ themselves but
may be any orthonormal combinations of them. For such a wave function, and
only for such a wave function, the following relations hold:

(a)

$$\Gamma_{\tau\sigma,\mu\nu} = \begin{vmatrix}\gamma_{\tau\mu} & \gamma_{\tau\nu}\\ \gamma_{\sigma\mu} & \gamma_{\sigma\nu}\end{vmatrix};$$

(b) the one-particle density matrix is idempotent, in the sense that

$$\sum_\kappa\gamma_{\mu\kappa}\gamma_{\kappa\nu} = \gamma_{\mu\nu}.$$

In the particular case that the spinorbitals of the single determinant are a
selection from the set $\phi_\nu$, the one- and two-particle density matrices take the
specially simple forms

$$\gamma_{\mu\nu} = \delta_{\mu\nu}n_\nu$$

and

$$\Gamma_{\tau\sigma,\mu\nu} = (\delta_{\tau\mu}\delta_{\sigma\nu} - \delta_{\tau\nu}\delta_{\sigma\mu})\,n_\mu n_\nu,$$

where $n_\nu$ is unity if $\phi_\nu$ appears in the determinant, and 0 if it does not.
Accordingly, the energy of a many-electron system with a single determinant wave
function is

$$E = \sum_\nu n_\nu h_{\nu\nu} + \tfrac12\sum_{\mu\nu}n_\mu n_\nu(g_{\mu\nu,\mu\nu} - g_{\mu\nu,\nu\mu}),$$

where $g_{\mu\nu,\mu\nu}$ is the coulomb integral and $g_{\mu\nu,\nu\mu}$ the exchange integral between
$\phi_\mu$ and $\phi_\nu$. The latter vanishes, of course, if $\phi_\mu$ and $\phi_\nu$ have opposite spins.
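
As a small worked instance of the single-determinant energy formula, consider helium with both electrons in one spatial orbital $e^{-\zeta r}$. The sketch below is an added illustration: the one- and two-electron integrals $\tfrac12\zeta^2 - 2\zeta$ and $\tfrac58\zeta$ (atomic units) are the standard values for a normalized $1s$ exponential, and the function names and the labelling of the spinorbitals (0 for spin up, 1 for spin down) are assumptions made here.

```python
def single_determinant_energy(n, h, g):
    """Sum of n_v h_vv plus half the sum of n_mu n_nu (coulomb - exchange).

    n[v]      occupation number (0 or 1) of spinorbital v
    h[v]      diagonal one-electron integral h_vv
    g(m, v, t, s) two-electron integral with phi_m*(i) phi_v*(j) on the left
                  and phi_t(i) phi_s(j) on the right
    """
    size = len(n)
    one = sum(n[v] * h[v] for v in range(size))
    two = 0.5 * sum(n[m] * n[v] * (g(m, v, m, v) - g(m, v, v, m))
                    for m in range(size) for v in range(size))
    return one + two

def helium_closed_shell(zeta):
    """Energy of the determinant |1s(up) 1s(down)> with orbital exponent zeta."""
    h_1s = 0.5 * zeta ** 2 - 2.0 * zeta     # kinetic plus nuclear attraction, Z = 2
    j_1s = 0.625 * zeta                     # coulomb integral of the 1s orbital

    def g(m, v, t, s):
        # One spatial orbital only, so the integral survives only when the
        # spins of electron i (m, t) and of electron j (v, s) both match.
        return j_1s if (m == t and v == s) else 0.0

    return single_determinant_energy([1, 1], [h_1s, h_1s], g)

print(helium_closed_shell(27 / 16))
```

For $\zeta = 27/16$ this reproduces the Kellner energy $-(27/16)^2$; the exchange term drops out because the two spinorbitals have opposite spins, and the self-interaction terms with $\mu = \nu$ cancel identically.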

V. Molecular and Atomic Spinorbital Operators 

The results quoted in the preceding paragraphs are valid whatever the
nature of the complete orthonormal set of spinorbitals, $\phi_\nu$. But when
discussing molecules one may find it convenient to work with creation and
annihilation operators defined in terms of molecular orbitals. For example, if we are
interested in the electronic spectrum of a molecule in a closed-shell ground
state, a natural choice of spinorbitals will be the solutions of the Hartree-Fock
equation for the ground state. Alternatively, one could adopt Löwdin's
“natural spinorbitals” $\psi_\nu$, which diagonalize the one-particle density matrix,
in the sense that

$$\langle\psi_\mu^+\psi_\nu^-\rangle = 0, \quad \mu \neq \nu.$$

At any rate, molecular spinorbital operators are very handy when it comes to
specifying an excited state or calculating matrix elements between the ground
state and the low-lying excited states of a molecule. For example, if $\psi_1$, $\psi_{-1}$
and $\psi_2$, $\psi_{-2}$ are the highest occupied and lowest unoccupied molecular
orbitals of benzene (taken in conjugate complex pairs with the natural phase
convention) then the singlet $B_{1u}$ state is, in molecular orbital theory,

$$\tfrac12(\psi_{2\alpha}^+\psi_{-1\alpha}^- + \psi_{2\beta}^+\psi_{-1\beta}^- + \psi_{-2\alpha}^+\psi_{1\alpha}^- + \psi_{-2\beta}^+\psi_{1\beta}^-)|N\rangle$$

where $|N\rangle$ represents the singlet $A_{1g}$ ground state. It would be difficult to
find a more compact, but equally explicit, expression. Doubly excited states 
may likewise be expressed in the form 

or as linear combinations of such expressions. 

In actual applications of molecular orbital theory one expresses each 
molecular orbital as a linear combination of certain atomic orbitals: 

fi Z ^fiaXa • 
a 

Conversely, the basis atomic orbitals may be expressed in terms of a sufficient 
set of molecular orbitals (which we assume to be orthonormal): 

Xa Z v • 

V 

The overlap integral between x« and is thus 

<J 

Introducing the atomic spinorbital creation and annihilation operators Xa 
and xZ’ defined by 

*; = L 

V V 

we find that the anticommutator of x + and y - is 

{Xa’X 7} = Z = S 9X . 

MV 

Atomic orbital overlap thus modifies the commutation relations between the 
associated creation and annihilation operators, and one must be careful to 
take this into account when working with a nonorthonormal basis set of 
spinorbitals. 

There is, however, one kind of orbital theory in which one is not troubled 
by overlap, for the simple reason that one neglects it. A most successful theory 
of this sort is the PPP (Pariser-Parr-Pople) theory of planar conjugated 
hydrocarbons. Let us consider this theory for a moment and rehearse the 
argument which McLachlan used for establishing the pairing properties of 
alternant hydrocarbons. 

The theory concentrates on the $\pi$ electrons, which are supposed to move
over a basis set of orthonormal spinorbitals $\phi_\nu$, two on each atom (one with
spin up, the other with spin down). The effective Hamiltonian for the $\pi$
electrons is, accordingly,

$$\mathscr{H} = \sum_{\mu\nu}\phi_\mu^+ h_{\mu\nu}\phi_\nu^- + \tfrac12\sum_{\mu\nu\sigma\tau}\phi_\mu^+\phi_\nu^+\,g_{\mu\nu,\tau\sigma}\,\phi_\sigma^-\phi_\tau^-$$

and this simplifies to

$$\mathscr{H} = \sum_{\mu\nu}\phi_\mu^+ h_{\mu\nu}\phi_\nu^- + \tfrac12\sum_{\mu\nu}\phi_\mu^+\phi_\nu^+\,\gamma_{\mu\nu}\,\phi_\nu^-\phi_\mu^-$$

if one neglects all two-electron integrals except the coulomb integrals $g_{\mu\nu,\mu\nu} = \gamma_{\mu\nu}$.
Given the one-electron integrals $h_{\mu\nu}$ and the two-electron integrals $\gamma_{\mu\nu}$ it is
then a routine matter to find the eigenvalues and eigenstates of this
Hamiltonian.

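Now suppose that the system is alternant. Then one can divide the atoms
into two sets, starred and unstarred, in such a way that the only nonvanishing
one-electron elements $h_{\mu\nu}$ are (a) the diagonal elements and (b) off-diagonal
elements connecting a spinorbital on a starred atom with one on an unstarred
atom. Starting with the $\pi$-electron Hamiltonian in the form

$$\mathscr{H} = \sum_\mu\phi_\mu^+ h_{\mu\mu}\phi_\mu^- + \sum_{\mu\neq\nu}\phi_\mu^+ h_{\mu\nu}\phi_\nu^- + \tfrac12\sum_{\mu\nu}\phi_\mu^+\phi_\nu^+\,\gamma_{\mu\nu}\,\phi_\nu^-\phi_\mu^-$$

we then proceed to transform it by stages. First we reverse the sign of all the
spinorbitals on the starred atoms, and write the new spinorbitals as $\chi_\nu$
rather than $\phi_\nu$. This gives

$$\mathscr{H} = \sum_\mu\chi_\mu^+ h_{\mu\mu}\chi_\mu^- - \sum_{\mu\neq\nu}\chi_\mu^+ h_{\mu\nu}\chi_\nu^- + \tfrac12\sum_{\mu\nu}\chi_\mu^+\chi_\nu^+\,\gamma_{\mu\nu}\,\chi_\nu^-\chi_\mu^-.$$

Next, we use the commutation relation $\{\chi_\mu^+, \chi_\nu^-\} = \delta_{\mu\nu}$ and rearrange the
factors in each term, so that the creation operators appear on the right and
the annihilation operators appear on the left. The result is

$$\mathscr{H} = \sum_\mu h_{\mu\mu}(1 - \chi_\mu^-\chi_\mu^+) + \sum_{\mu\neq\nu}\chi_\nu^- h_{\mu\nu}\chi_\mu^+ + \tfrac12\sum_{\mu\nu}\gamma_{\mu\nu}(1 - \chi_\mu^-\chi_\mu^+ - \chi_\nu^-\chi_\nu^+ + \chi_\nu^-\chi_\mu^-\chi_\mu^+\chi_\nu^+).$$

Finally we relabel $\chi_\nu^-$ as $\omega_\nu^+$ and $\chi_\nu^+$ as $\omega_\nu^-$, so that the new operators may be
thought of as creating or annihilating holes in the spinorbitals $\chi_\nu$. We then
obtain

$$\mathscr{H} = \sum_\mu\omega_\mu^+ h_{\mu\mu}\omega_\mu^- + \sum_{\mu\neq\nu}\omega_\mu^+ h_{\mu\nu}\omega_\nu^- + \tfrac12\sum_{\mu\nu}\omega_\mu^+\omega_\nu^+\,\gamma_{\mu\nu}\,\omega_\nu^-\omega_\mu^- + \sum_\mu\Big(h_{\mu\mu} + \tfrac12\sum_\nu\gamma_{\mu\nu}\Big)(1 - 2\omega_\mu^+\omega_\mu^-).$$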

In this expression for the Hamiltonian the first three terms on the right hand
side have exactly the same form as the original Hamiltonian; the only
difference is in the last term. In this term the factor $(h_{\mu\mu} + \tfrac12\sum_\nu\gamma_{\mu\nu})$ is assumed
in the PPP theory to have the constant value $U$ for a carbon $2p\pi$ orbital,
because $h_{\mu\mu}$ includes attractive contributions from the cores of neighbouring
atoms, and these will be compensated by placing half an electron in each
spinorbital on every neighbour. If $N$ is the total number of $\pi$ electrons and $M$
the total (even) number of spinorbitals in the conjugated system, the last
term in $\mathscr{H}$ may be written

$$U\sum_\mu[1 - 2(1 - n_\mu)] = U(2N - M) = 2UQ,$$

where $Q = N - \tfrac12 M$ is the electronic charge on the system. We conclude that
apart from a harmless constant the Hamiltonian for an alternant
hydrocarbon ion takes the same form whether it is expressed (a) in terms of the
operators $\phi_\nu^+$, $\phi_\nu^-$ or (b) in terms of the hole operators $\omega_\nu^+$, $\omega_\nu^-$, which insert
or remove holes in $\pm\phi_\nu$ (the sign depending on whether the atom is starred or
not). It follows that corresponding positive and negative ions have their
states in a 1:1 correspondence, differing in energy by $2UQ$.

This is only a bare outline of McLachlan’s elegant proof of the pairing
theorems; but as the results are so important, and as they show off second
quantization at its best, the reader is strongly recommended to fill in the
details for himself. The great virtue of the approach is that it shows the
underlying connection between the Hamiltonian of a positive ion and that of the
negative ion. Thus the pairing is maintained even if there is extensive
configurational interaction; nowhere in the proof is it assumed that the ground
state, or any other state, can be represented as a single determinant of
molecular spinorbitals. Indeed, neither molecular orbitals nor density matrices
are mentioned or implicitly invoked at any point in the argument.

Atomic spinorbital operators can be used for defining the population of an 
atomic orbital, and the bond order between two orbitals, even when the 
molecular wave function is not represented as a single determinant. For 
simplicity we shall neglect atomic orbital overlap. Let us associate the atomic 
spinorbital operators $\phi^+$ in pairs, to obtain two-component operators,

$$\boldsymbol{\phi}_r^+ = (\alpha_r^+, \beta_r^+), \qquad \boldsymbol{\phi}_r^- = \begin{pmatrix}\alpha_r^-\\ \beta_r^-\end{pmatrix},$$

where the first component puts an electron into the orbital $\phi_r$ with spin up,
and the second puts an electron into the same orbital with spin down. The
population of $\phi_r$ is then represented by the operator

$$\hat n_r = \boldsymbol{\phi}_r^+\cdot\boldsymbol{\phi}_r^- = \alpha_r^+\alpha_r^- + \beta_r^+\beta_r^- = \hat n_{r\alpha} + \hat n_{r\beta}$$

and its expectation value is

$$q_r = \langle\boldsymbol{\phi}_r^+\cdot\boldsymbol{\phi}_r^-\rangle = \gamma_{rr}$$

where $\gamma_{sr}$ is the “spinless” one-particle density matrix in the representation
based on the orbitals $\phi_r$.

The off-diagonal elements of $\gamma_{sr}$ are also of interest, as may be seen by
evaluating $\gamma_{sr}$ for a single determinant of molecular spinorbitals. Writing a
typical molecular orbital in the form

$$\psi_j = \sum_r c_{jr}\phi_r$$

we find that

$$\gamma_{sr} = \langle\boldsymbol{\phi}_r^+\cdot\boldsymbol{\phi}_s^-\rangle = \sum_{jk}c_{jr}^*c_{ks}\langle\boldsymbol{\psi}_j^+\cdot\boldsymbol{\psi}_k^-\rangle = \sum_j n_j c_{jr}^*c_{js} = p_{rs},$$

which is the Coulson bond order between the atomic orbitals $\phi_r$ and $\phi_s$. If
$\gamma_{sr}$ is complex, $p_{rs}$ is its real part. So even when the state of a molecule is not
represented by a single determinant, it seems natural to define the bond order
between the orbitals $\phi_r$ and $\phi_s$ as the expectation value, for that state, of the
operator $\boldsymbol{\phi}_r^+\cdot\boldsymbol{\phi}_s^-$. It is, of course, difficult to make the concept of bond order
fully objective, because the concept of an optimal basis set of atomic orbitals 
is fraught with difficulties; but at least the bond order between two orbitals 
can be defined for any state whatever, once the orbitals have been exactly 
specified. 
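
As an illustration of the single-determinant formula for $\gamma_{sr}$, the following minimal Python sketch evaluates populations and Coulson bond orders for butadiene in the Hückel approximation. The molecule, the units ($\alpha = 0$, $\beta = -1$) and the function name `coulson_bond_orders` are assumptions chosen for this example and do not appear in the text; atomic orbital overlap is neglected, as above.

```python
import numpy as np


def coulson_bond_orders(hueckel_matrix, n_electrons):
    """Spinless density matrix gamma_sr = sum_j n_j c_jr c_js for a closed-shell
    single determinant built from the lowest Hueckel molecular orbitals."""
    _, coeffs = np.linalg.eigh(hueckel_matrix)   # columns are molecular orbitals
    occupied = coeffs[:, : n_electrons // 2]     # doubly occupied orbitals, n_j = 2
    return 2.0 * occupied @ occupied.T


# Butadiene chain with alpha = 0 and beta = -1 (assumed units).
adjacency = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
gamma = coulson_bond_orders(-adjacency, n_electrons=4)

populations = np.diag(gamma)   # q_r = gamma_rr
p_12 = gamma[0, 1]             # bond order of a terminal bond
p_23 = gamma[1, 2]             # bond order of the central bond
print(populations, p_12, p_23)
```

Under these assumptions every population comes out as 1, and the bond orders are 2/√5 (about 0.894) for the terminal bonds and 1/√5 (about 0.447) for the central bond.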


VI. Spin Populations 

In the preceding section we introduced for each atomic orbital a two-component
creation operator (and a two-component annihilation operator)
of the type

$$\varphi_r^+ = (\alpha_r^+, \beta_r^+), \qquad \varphi_r^- = \begin{pmatrix} \alpha_r^- \\ \beta_r^- \end{pmatrix}.$$

These operators may be used for defining in a simple manner the spin population
in the atomic orbital $\varphi_r$. Since spin is a vector (more exactly, a pseudovector)
with three components, the spin population of $\varphi_r$ is also a vector.
Its three components are as follows:

$$(\rho_x)_r = \varphi_r^+ \cdot \sigma_x \cdot \varphi_r^-, \qquad (\rho_y)_r = \varphi_r^+ \cdot \sigma_y \cdot \varphi_r^-, \qquad (\rho_z)_r = \varphi_r^+ \cdot \sigma_z \cdot \varphi_r^-,$$

where $\sigma_x$, $\sigma_y$, and $\sigma_z$ are the three spin matrices of Pauli. Thus the z component
of the spin population of $\varphi_r$ is

$$(\rho_z)_r = (\alpha_r^+, \beta_r^+) \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \alpha_r^- \\ \beta_r^- \end{pmatrix} = n_r^{\alpha} - n_r^{\beta},$$

in line with the usual elementary definition.


VII. Local Creation and Annihilation Operators


We now turn to creation and annihilation operators which refer to points 
rather than orbitals of finite extent. 

The local operators $\psi^+(r')$ and $\psi^-(r')$, where $r'$ indicates a possible position
and spin orientation for an electron, may be formally defined in terms of any
complete set of spinorbitals $\varphi_\nu$ and their associated operators, as follows:

$$\psi^+(r') = \sum_\nu \varphi_\nu^{*}(r')\,\varphi_\nu^+ ,$$

$$\psi^-(r') = \sum_\nu \varphi_\nu(r')\,\varphi_\nu^- .$$

In these definitions $\varphi_\nu^{*}(r')$ is the complex conjugate of $\varphi_\nu(r')$, which is simply
the value of $\varphi_\nu$ at $r'$.

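The commutation relations between the $\psi^+$ and the $\psi^-$ may be obtained
directly from our earlier results:

$$\{\psi^+(r'), \psi^-(r'')\} = \sum_{\mu\nu} \varphi_\mu^{*}(r')\,\varphi_\nu(r'')\,\{\varphi_\mu^+, \varphi_\nu^-\} = \sum_\nu \varphi_\nu^{*}(r')\,\varphi_\nu(r'') = \delta(r' - r''),$$

where $\delta(r' - r'')$ is the three-dimensional Dirac delta function, and

$$\{\psi^+(r'), \psi^+(r'')\} = 0 = \{\psi^-(r'), \psi^-(r'')\}.$$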

(The reader should note that the result of applying $\psi^+(r')$ to the vacuum
state $|\,\rangle$ is a non-normalizable entity; but this causes no trouble if the
formalism is correctly handled.)
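
The anticommutation rules can be checked numerically in a finite model. The sketch below is only a discrete analogue, under stated assumptions: three spinorbitals are represented by Jordan-Wigner matrices (a standard construction that is not used in the text), the "points" are three grid indices, and the orbital values $\varphi_\nu(r)$ are taken from a unitary discrete Fourier matrix so that the set is complete on the grid. The Dirac delta function then becomes a Kronecker delta.

```python
from functools import reduce

import numpy as np


def annihilators(n_orbitals):
    """Jordan-Wigner matrices for the annihilation operators of n_orbitals spinorbitals."""
    z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # empties an occupied orbital
    eye = np.eye(2)
    ops = []
    for k in range(n_orbitals):
        factors = [z] * k + [lower] + [eye] * (n_orbitals - k - 1)
        ops.append(reduce(np.kron, factors))
    return ops


def anticommutator(a, b):
    return a @ b + b @ a


n = 3
a = annihilators(n)

# Assumed orbital values on a 3-point grid: phi[r, v] is orbital v at point r.
grid = np.arange(n)
phi = np.exp(2j * np.pi * np.outer(grid, grid) / n) / np.sqrt(n)   # unitary

# psi^-(r) = sum_v phi_v(r) a_v ; psi^+(r) is its Hermitian conjugate.
psi_minus = [sum(phi[r, v] * a[v] for v in range(n)) for r in range(n)]
psi_plus = [op.conj().T for op in psi_minus]

identity = np.eye(2 ** n)
same_point = anticommutator(psi_plus[0], psi_minus[0])
different_points = anticommutator(psi_plus[0], psi_minus[1])
two_annihilators = anticommutator(psi_minus[0], psi_minus[1])
print(np.allclose(same_point, identity), np.allclose(different_points, 0))
```

In this model `same_point` is the unit matrix and the other two anticommutators vanish, mirroring the relations above.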

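Having gained some experience with the spinorbital operators $\varphi^+$ and $\varphi^-$
we can review more quickly the important relations between the operators
$\psi^{\pm}(r')$ and other, physical, operators. First we establish the effects of $\psi^-(r')$
on a Slater determinant and of $\psi^+(r')$ on its complex conjugate.

$$\psi^-(r')\,|\varphi_1\rangle = \sum_\nu \varphi_\nu(r')\,\varphi_\nu^-\,|\varphi_1\rangle = \varphi_1(r')\,|\,\rangle ;$$

$$\psi^-(r')\,|\varphi_1\varphi_2\rangle = \sum_\nu \varphi_\nu(r')\,\varphi_\nu^-\,|\varphi_1\varphi_2\rangle = \varphi_1(r')\,|\varphi_2\rangle - \varphi_2(r')\,|\varphi_1\rangle, \ \text{etc.},$$

and likewise

$$\langle\varphi_1|\,\psi^+(r') = \langle\,|\,\varphi_1^{*}(r'),$$

$$\langle\varphi_1\varphi_2|\,\psi^+(r') = \langle\varphi_2|\,\varphi_1^{*}(r') - \langle\varphi_1|\,\varphi_2^{*}(r'), \ \text{etc.},$$

so that, for example,

$$\langle\varphi_1|\,\psi^+(r')\psi^-(r'')\,|\varphi_1\rangle = \varphi_1^{*}(r')\,\varphi_1(r''),$$

$$\langle\varphi_1\varphi_2|\,\psi^+(r')\psi^-(r'')\,|\varphi_1\varphi_2\rangle = \varphi_1^{*}(r')\,\varphi_1(r'') + \varphi_2^{*}(r')\,\varphi_2(r''),$$

with corresponding results for larger determinants. The expectation value of
$\psi^+(r')\psi^-(r'')$ will be recognized as the one-particle density matrix

$$\gamma(r''|r') = \langle \psi^+(r')\,\psi^-(r'') \rangle,$$

and the expectation value of $\psi^+(r_1')\psi^+(r_2')\psi^-(r_2'')\psi^-(r_1'')$ is found to be the
two-particle density matrix

$$\Gamma(r_1'' r_2''|r_1' r_2') = \langle \psi^+(r_1')\,\psi^+(r_2')\,\psi^-(r_2'')\,\psi^-(r_1'') \rangle.$$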


As in the previous sections, we can express any one- or two-electron operator
in terms of local creation and annihilation operators. Let us first consider
some one-electron operators. The easiest way to discover their operator
equivalents is to work with a single-electron system, namely an electron in
the spinorbital $\varphi(r')$. The electron density at $r'$ is then

$$\varphi^{*}(r')\,\varphi(r') = \langle\varphi|\,\psi^+(r')\,\psi^-(r')\,|\varphi\rangle .$$

Thus the operator representing the electron density at $r'$ is $\psi^+(r')\psi^-(r')$, and
its expectation value is $\gamma(r'|r')$, the corresponding diagonal element of the
one-particle density matrix.

Similarly, for the spinorbital $\varphi$ the electronic dipole moment has the
expectation value

$$\int \varphi^{*}(r')\,e r'\,\varphi(r')\,dr' = \langle\varphi|\int \psi^+(r')\,e r'\,\psi^-(r')\,dr'\,|\varphi\rangle ;$$

we infer that the operator equivalent of the dipole moment of a many-electron
system must be

$$M = \int \psi^+(r')\,e r'\,\psi^-(r')\,dr'.$$


Operators involving the momentum also come out in a very simple form.
First, the kinetic energy. Its expectation value for an electron in $\varphi$ is

$$-\frac{\hbar^2}{2m} \int \varphi^{*}(r')\,\nabla^2 \varphi(r')\,dr' = -\frac{\hbar^2}{2m}\,\langle\varphi|\int \psi^+(r')\,\nabla^2 \psi^-(r')\,dr'\,|\varphi\rangle ;$$

accordingly the electronic kinetic energy operator for an arbitrary number of
electrons is

$$T = -\frac{\hbar^2}{2m} \int \psi^+(r')\,\nabla^2 \psi^-(r')\,dr' = \frac{1}{2m} \int \psi^+(r')\,p^2\,\psi^-(r')\,dr',$$

where p denotes the momentum operator. The momentum itself is

$$P = \int \psi^+(r')\,p\,\psi^-(r')\,dr'.$$

The current density at a point in the molecule can be expressed in a specially 
neat manner using creation and annihilation operators: 

Kr) = ~ W + (r a )pi!/-(r a ) + «/'VW('-")]- 

m 

The expectation values of one-electron operators can all be expressed in
terms of the one-electron density matrix $\gamma(r''|r')$; this is well known, but for
completeness we give the expressions for $\langle M \rangle$ and $\langle T \rangle$:

$$\langle M \rangle = \int e r'\,\gamma(r'|r')\,dr',$$

$$\langle T \rangle = -\frac{\hbar^2}{2m} \int\!\!\int \nabla''^{\,2}\,\gamma(r''|r')\,\delta(r'' - r')\,dr'\,dr''.$$

The expression for $\langle T \rangle$ is rather ungainly, and the corresponding expression
for $\langle J(r) \rangle$ is still more cumbersome. The density matrix symbolism thus
begins to lose its attractiveness when one works in a coordinate-spin
representation, as we are now doing.

Local creation and annihilation operators also lend themselves readily to
the definition of spin-dependent quantities. Thus if r denotes a point (with
no spin assigned to it), one may define two-component operators

$$\psi^+(r) = [\psi^+(r^{\alpha}), \psi^+(r^{\beta})], \qquad \psi^-(r) = \begin{bmatrix} \psi^-(r^{\alpha}) \\ \psi^-(r^{\beta}) \end{bmatrix},$$

whose invariant scalar product

$$\psi^+(r) \cdot \psi^-(r) = n(r^{\alpha}) + n(r^{\beta}) = n(r)$$

is an operator representing the total electron density at r. The spin density
at the point r, which is a vector operator, has the three components

$$[\psi^+(r)\,\sigma_x\,\psi^-(r),\ \psi^+(r)\,\sigma_y\,\psi^-(r),\ \psi^+(r)\,\sigma_z\,\psi^-(r)] = \psi^+(r)\,\boldsymbol{\sigma}\,\psi^-(r),$$

where $\sigma_x$, $\sigma_y$ and $\sigma_z$ are the Pauli spin matrices. To show the economical
advantages of this notation, let us write down the Fermi contact Hamiltonian
for the interaction between a magnetic nucleus and the surrounding electrons.
It is simply

$$\frac{8\pi}{3}\,g\beta g_N\beta_N \sum_i \delta(r_i - r_N)\,S_i \cdot I = \frac{4\pi}{3}\,g\beta g_N\beta_N\,\psi^+(r_N)\,\boldsymbol{\sigma}\,\psi^-(r_N) \cdot I,$$

where the g’s and $\beta$’s have their usual meanings and $r_N$ is the position of the
nucleus of spin I. A sum over all the electrons has thus been replaced by a
simple expression involving only one-particle creation and annihilation
operators. In each expression, of course, the final dot indicates a scalar product
of two three-dimensional vectors. Later we shall show how this symbolism
may be used to establish the isotropy of the nuclear spin-spin coupling
constants which are determined by high-resolution nuclear magnetic resonance.

Two-electron operators can also be expressed compactly in terms of the
local operators $\psi^+(r')$ and $\psi^-(r')$. For example, the coulomb repulsion energy
of all the electrons is represented by the operator

$$G = \frac{1}{2} \int\!\!\int \psi^+(r_1')\,\psi^+(r_2')\,\frac{e^2}{|r_1' - r_2'|}\,\psi^-(r_2')\,\psi^-(r_1')\,dr_1'\,dr_2',$$

where each integration is over all space and both spin orientations. As always, 
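the order of the factors is essential. Likewise, the dipolar interaction
Hamiltonian between the magnetic moments of a group of electrons takes the form

$$\frac{1}{2}(g\beta)^2 \int\!\!\int dr_1\,dr_2\ \psi^+(r_1^{\kappa})\,\psi^+(r_2^{\lambda})\,\psi^-(r_2^{\mu})\,\psi^-(r_1^{\nu}) \times \frac{1}{4}\left[ \frac{\sigma^{\kappa\nu} \cdot \sigma^{\lambda\mu}}{r_{12}^{3}} - \frac{3(\sigma^{\kappa\nu} \cdot r_{12})(\sigma^{\lambda\mu} \cdot r_{12})}{r_{12}^{5}} \right],$$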

in which the Greek superscripts indicate alternative spin orientations, and 
we sum over them all. 

Expectation values of two-electron operators may be expressed in terms of
the two-particle density matrix $\Gamma(r_1'' r_2''|r_1' r_2')$ defined above. The complete
two-particle density matrix is, however, seldom of interest, since the most
important two-particle expectation values depend only upon its diagonal
elements.


VIII. An Application 

To illustrate the use of local creation and annihilation operators in molecular
problems, let us use them to demonstrate a result which has been established
by more elementary methods, but less briefly. The result concerns the
electron-mediated coupling between two nuclear spins in a closed-shell
molecule; we shall demonstrate that the coupling tensor is isotropic.

The perturbation responsible for the coupling is $\mathcal{H}_1 + \mathcal{H}_2$, where $\mathcal{H}_1$ is
the Fermi contact interaction at nucleus 1, namely

$$\mathcal{H}_1 = \frac{4\pi}{3}\,g\beta g_N\beta_N\,\psi^+(r_1)\,\boldsymbol{\sigma}\,\psi^-(r_1) \cdot I(1),$$

and $\mathcal{H}_2$ is similarly defined. The coupling energy between the two nuclei
is a second-order perturbation energy, having the value

$$I_l(1)\,J_{lm}\,I_m(2) = 2\,\mathrm{Re} \sum_{n>0} \langle 0|\mathcal{H}_1|n\rangle\,(E_0 - E_n)^{-1}\,\langle n|\mathcal{H}_2|0\rangle,$$

where the sum is over all the excited states, and the subscripts l and m indicate
vector components in three-dimensional space. Using Greek superscripts to
indicate the orientation of an electron spin, we may thus write the coupling
constant as

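$$J_{lm} = \sigma_l^{\kappa\lambda}\,\sigma_m^{\mu\nu} \left[ \frac{4\pi}{3}\,g\beta g_N\beta_N \right]^2 \times 2\,\mathrm{Re} \sum_{n>0} \langle 0|\psi^+(r_1^{\kappa})\,\psi^-(r_1^{\lambda})|n\rangle\,(E_0 - E_n)^{-1}\,\langle n|\psi^+(r_2^{\mu})\,\psi^-(r_2^{\nu})|0\rangle,$$

where the summation convention has been adopted for repeated superscripts.
We now observe that if any term in this sum is not to vanish there must be a
relation between the four Greek superscripts. The first two operators alter
the $M_S$ value of the ground state from 0 to $\lambda - \kappa$ (we may think of these
letters as having values $\pm\tfrac{1}{2}$), and the last two alter it to $\mu - \nu$. To connect the
ground state with the same excited state (which we may take to be an eigenstate
of $S_z$) the pairs of operators must either satisfy $\kappa = \nu$, $\lambda = \mu$ or $\kappa = \lambda$,
$\mu = \nu$. Therefore

$$\mathrm{Re} \sum_{n>0} \langle 0|\psi^+(r_1^{\kappa})\,\psi^-(r_1^{\lambda})|n\rangle\,(E_0 - E_n)^{-1}\,\langle n|\psi^+(r_2^{\mu})\,\psi^-(r_2^{\nu})|0\rangle = A\,\delta^{\kappa\nu}\delta^{\lambda\mu} + B\,\delta^{\kappa\lambda}\delta^{\mu\nu}.$$

The coupling constant therefore takes the form

$$J_{lm} = C\,\sigma_l^{\kappa\lambda}\,\sigma_m^{\mu\nu}\,(A\,\delta^{\kappa\nu}\delta^{\lambda\mu} + B\,\delta^{\kappa\lambda}\delta^{\mu\nu}) = C\,(A\,\sigma_l^{\kappa\lambda}\sigma_m^{\lambda\kappa} + B\,\sigma_l^{\kappa\kappa}\sigma_m^{\mu\mu}),$$

which may be written in the alternative form

$$J_{lm} = C\,(A\ \mathrm{trace}\ \sigma_l\sigma_m + B\ \mathrm{trace}\ \sigma_l\ \mathrm{trace}\ \sigma_m).$$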

But the Pauli matrices have the property that

$$\mathrm{trace}\ \sigma_l\sigma_m = 2\delta_{lm} \quad \text{and} \quad \mathrm{trace}\ \sigma_l = 0.$$

It follows that the coupling tensor $J_{lm}$ is a multiple of the unit tensor $\delta_{lm}$,
and hence that the electron-mediated interaction between the two nuclei is an
isotropic interaction, which may be written in the simpler form

$$J\,I(1) \cdot I(2) \quad \text{with} \quad J = 2CA.$$

This is what we set out to prove.
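
The trace argument is easy to confirm numerically. The following Python sketch builds the tensor $C(A\ \mathrm{trace}\ \sigma_l\sigma_m + B\ \mathrm{trace}\ \sigma_l\ \mathrm{trace}\ \sigma_m)$ from the standard Pauli matrices for arbitrary test values of A, B and C; the function name `coupling_tensor` and the numerical values are assumptions made for the illustration. With the standard matrices the traces evaluate to $2\delta_{lm}$ and 0, so the tensor is $2CA$ times the unit tensor whatever the value of B.

```python
import numpy as np

# Standard Pauli matrices sigma_x, sigma_y, sigma_z.
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def coupling_tensor(a_coeff, b_coeff, c_coeff=1.0):
    """J_lm = C (A trace(sigma_l sigma_m) + B trace(sigma_l) trace(sigma_m))."""
    j = np.zeros((3, 3), dtype=complex)
    for l in range(3):
        for m in range(3):
            j[l, m] = c_coeff * (
                a_coeff * np.trace(sigma[l] @ sigma[m])
                + b_coeff * np.trace(sigma[l]) * np.trace(sigma[m])
            )
    return j.real


j_tensor = coupling_tensor(a_coeff=0.7, b_coeff=1.3)
print(j_tensor)   # a multiple of the 3 x 3 unit matrix
```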


IX. Concluding Remarks

The possible applications of second quantization methods in theoretical
chemistry are multitudinous. As we have seen, there is a close relation between
creation and annihilation operators, on the one hand, and the density matrices
which one would so like to be able to calculate; the latter are simply
expectation values of combinations of the former. Thus the theory of second
quantization includes the theory of density matrices as a special case; problems
which can be solved by density matrix methods can also be solved by second
quantization. The converse is not true, however, because second quantization
deals with physical operators themselves, rather than their expectation values.

Already second quantization methods are being effectively applied to the 
motion of excitons and charge carriers in molecular crystals, and the forces 
between large “metallic” molecules. Creation and annihilation operators in 
molecular quantum mechanics are here to stay. 


I. Introduction

One of the most fundamental and practical problems in the quantum
mechanics of atoms, molecules, and crystals is to solve secular equations of
the form

$$(\mathbf{H} - E\mathbf{S})\,\mathbf{c} = 0, \tag{1}$$

where H and S are the matrix representatives of the Hamiltonian and
unity in some basis set $\phi$, and c is a column vector of coefficients. It is well
known that if the system of interest possesses symmetry, a knowledge of the 
irreducible representations of the symmetry group can be used to factorize 
Eq. (1) into a set of secular equations of lower order. 
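
A small numerical sketch may help fix ideas. It solves Eq. (1) for a two-function basis related by a single symmetry operation and compares the roots with the two one-dimensional factor equations obtained from the symmetric and antisymmetric combinations. The matrix elements (`alpha`, `beta`, `overlap`) and the helper `solve_secular`, which uses symmetric orthogonalization, are assumptions made for the illustration and are not part of the method developed in this paper.

```python
import numpy as np


def solve_secular(h, s):
    """Solve (H - E S) c = 0 for real symmetric H and positive-definite S
    by symmetric orthogonalization with S^(-1/2)."""
    s_vals, s_vecs = np.linalg.eigh(s)
    s_inv_half = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    energies, c_orth = np.linalg.eigh(s_inv_half @ h @ s_inv_half)
    return energies, s_inv_half @ c_orth


# Two equivalent basis functions interchanged by the symmetry operation.
alpha, beta, overlap = -1.0, -0.5, 0.25
h = np.array([[alpha, beta], [beta, alpha]])
s = np.array([[1.0, overlap], [overlap, 1.0]])

energies, coeffs = solve_secular(h, s)

# Symmetry factorization: one 1 x 1 equation per irreducible representation.
factored = np.sort([(alpha + beta) / (1 + overlap), (alpha - beta) / (1 - overlap)])
print(energies, factored)
```

Both routes give the same two roots, which is the factorization property referred to above in its simplest form.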

Professor Slater has been closely associated with this problem, particularly 
in the field of atoms. One of his major contributions was the theory of the 
central-field approximation for atoms (Slater, 1929), which leads to secular 
equations of type (1). The striking feature of his method was that he showed 
how to factorize and solve the secular equations without the use of group 
theory. Slater’s method works well in those cases in which the irreducible
representations occur not more than once in the basis and the degeneracies
are low. When this is no longer true, he acknowledges the value of group
theory techniques (Slater, 1960).

* This research was supported by the following grant: National Aeronautics and Space
Administration Grant NsG-275-62.

However, the group theory methods described in the literature have an 
unnecessary weakness. In the general case in which degenerate irreducible 
representations occur more than once, they require actual matrix realizations 
of the representations. Yet the factorized secular equations are independent 
of any particular matrixes, and depend only on the invariant characteristics, 
all of which are contained in the character table of the group. 

The object of this paper is to derive the explicitly invariant forms of the 
factor equations. It gives me great pleasure to dedicate it to Professor Slater, 
who has contributed so much to the problem of factorizing secular equations. 

II. Description of Problem 

Let the basis consist of n linearly independent functions $\phi_1, \phi_2, \ldots, \phi_n$.
The matrixes H and S occurring in Eq. (1) are defined by

$$\mathbf{H} = \langle \phi|\mathcal{H}|\phi \rangle \quad \text{and} \quad \mathbf{S} = \langle \phi|\phi \rangle, \tag{2}$$

where the row vector $\phi = (\phi_1, \phi_2, \ldots, \phi_n)$. Let G be the symmetry group of
order g, and with h classes, associated with the system of interest. By definition
the Hamiltonian commutes with all the symmetry operators $\mathcal{R}$ belonging to
the group,

$$\mathcal{R}\mathcal{H} = \mathcal{H}\mathcal{R}. \tag{3}$$

It will be assumed that the space of the functions $\phi$ is closed under the
operations of the group, so that $\phi$ forms the basis for a matrix representation
$\Gamma$, in general reducible, of G. The matrices $\Gamma(R)$ are defined in the usual
way by

$$\mathcal{R}\phi = \phi\,\Gamma(R). \tag{4}$$

The completely reduced form of $\Gamma$ may be written formally as

$$\Gamma = \sum_{\alpha=1}^{h} m_\alpha \Gamma^{(\alpha)}, \tag{5}$$

where $\Gamma^{(\alpha)}$ is the $\alpha$th irreducible representation of dimension $l_\alpha$.

The essential step in solving Eq. (1) is to find the n roots (eigenvalues) of 
the determinantal equation 

det(H — ES) = 0. (6) 

Equation (1) may then be solved for the corresponding eigenvectors c. The
structure of $\Gamma$ implies that there are $m_\alpha$ distinct eigenvalues $E_i^{(\alpha)}$ ($i = 1, 2, \ldots, m_\alpha$),
each $l_\alpha$-fold degenerate, belonging to the representation $\Gamma^{(\alpha)}$; the total number
of eigenvalues is

$$n = \sum_{\alpha=1}^{h} m_\alpha l_\alpha. \tag{7}$$
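
The multiplicities $m_\alpha$ in Eqs. (5) and (7) can be obtained from characters alone by the standard reduction formula $m_\alpha = g^{-1} \sum_R \chi^{(\alpha)}(R)^{*} \chi(R)$, which is not written out in the text. The Python sketch below applies it to an assumed example: the group $C_{3v}$ acting on three equivalent orbitals, whose reducible characters are taken to be 3, 0, 1 for the classes $E$, $2C_3$, $3\sigma_v$.

```python
import numpy as np

# Character table of C3v by class (E, 2C3, 3sigma_v); assumed example group.
class_sizes = np.array([1, 2, 3])
character_table = {
    "A1": np.array([1, 1, 1]),
    "A2": np.array([1, 1, -1]),
    "E": np.array([2, -1, 0]),
}

# Characters of the reducible representation spanned by three equivalent orbitals.
chi_reducible = np.array([3, 0, 1])


def multiplicities(chi, table, sizes):
    """m_alpha = (1/g) * sum over classes of size * conj(chi_alpha) * chi."""
    g = sizes.sum()
    return {
        name: int(round((np.sum(sizes * np.conj(chi_irrep) * chi) / g).real))
        for name, chi_irrep in table.items()
    }


m = multiplicities(chi_reducible, character_table, class_sizes)
# Dimension l_alpha is the character of the identity class.
n_total = sum(m[name] * int(character_table[name][0]) for name in m)
print(m, n_total)
```

For this example the representation contains $A_1$ once and $E$ once, and Eq. (7) gives $n = 1 \cdot 1 + 1 \cdot 2 = 3$, as it must.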

It follows that a transformation matrix U exists, depending only on the
properties of G, which will reduce the determinant of Eq. (6) to a block diagonal
form* which can be factored in the following manner:

$$\det(\mathbf{U}\mathbf{H}\mathbf{U}^{\dagger} - E\,\mathbf{U}\mathbf{S}\mathbf{U}^{\dagger}) = \prod_{\alpha=1}^{h} \{\det(\mathbf{H}_\alpha - E\mathbf{S}_\alpha)\}^{l_\alpha}, \tag{8}$$

where the matrices $\mathbf{H}_\alpha$ and $\mathbf{S}_\alpha$ are of order $m_\alpha \times m_\alpha$. The roots of Eq. (6) are
unaltered by the transformation, and the $m_\alpha$ eigenvalues $E_i^{(\alpha)}$ are therefore
given by the factor determinantal equation

$$\det(\mathbf{H}_\alpha - E\mathbf{S}_\alpha) = 0. \tag{9}$$

The aim of the present paper is to construct irreducible factor equations (9)
involving only the characters $\chi^{(\alpha)}$ of the irreducible representations $\Gamma^{(\alpha)}$ of G.
It is not feasible in general to determine a transformation matrix U which will
do the job directly. The first step is to use the well-known procedure of
projecting symmetry adapted functions $\Phi^{(\alpha)}$ out of the basis $\phi$. The number of
sets of such functions required is equal to the number of generators p in the
basis. From the new (redundant) basis $\Phi$, matrixes $\mathbf{H}^{(\alpha)}$ and $\mathbf{S}^{(\alpha)}$ of order
$p l_\alpha \times p l_\alpha$ and rank $m_\alpha$ are constructed; these involve the irreducible matrixes
$\Gamma^{(\alpha)}$. The final step is to sum over the appropriate principal minors of order
$m_\alpha$ of $\mathbf{H}^{(\alpha)} - E\mathbf{S}^{(\alpha)}$, which yields the explicitly invariant form of Eq. (9).

III. Factorization with Simple Basis 

Consider first the simplest case in which the space of the basis $\phi$ can be
generated by the action of the g symmetry operators $\mathcal{R}$ of the group on one
member, say $\phi_1$. This is only possible if $n \le g$, or more precisely $m_\alpha \le l_\alpha$. A
new basis, $\Phi^{(1)}, \Phi^{(2)}, \ldots, \Phi^{(h)}$ of g symmetry functions may then be defined
by†

$$\Phi^{(\alpha)}_{ik} = g^{-1} \sum_R \Gamma^{(\alpha)}_{ik}(R)\,\mathcal{R}\phi_1, \tag{10}$$

where the sum is over all elements R of G. The significance of the $\Phi$’s follows
from the orthogonality theorem for irreducible representations (Wigner, 1959),
which leads to the result

$$\langle \Phi^{(\alpha)}_{ik}, \mathcal{D}\,\Phi^{(\beta)}_{jl} \rangle = \delta_{\alpha\beta}\,\delta_{ij}\,(g l_\alpha)^{-1} \sum_R \Gamma^{(\alpha)}_{kl}(R)\,D_R, \tag{11}$$

where $\mathcal{D}$ is any operator which commutes with the operators of G, and

$$D_R = \langle \phi_1, \mathcal{D}\mathcal{R}\phi_1 \rangle. \tag{12}$$

* Note that U is not the same as the matrix which reduces the matrixes of $\Gamma$
simultaneously to block form (5).

† The usual definition involves the complex conjugate of the matrix element $\Gamma$, which is
a nuisance in the present work.

Let the members $\Phi^{(\alpha)}_{ik}$ of the new basis be ordered lexicographically, first by
the representation superscript $\alpha$, then by the row suffix i, and finally by the
column suffix k. Then it follows from Eq. (11) that the new $g \times g$ matrix
representative of $\mathcal{D} = \mathcal{H} - E$, which commutes with G, will have a block form
of the kind illustrated in Fig. 1. The $l_\alpha$ blocks $D^{(\alpha)} = H^{(\alpha)} - ES^{(\alpha)}$ of order
$l_\alpha \times l_\alpha$, belonging to representation $\Gamma^{(\alpha)}$, are identical, since the expression on
the right-hand side of Eq. (11) is independent of i for $i = j$. The block submatrix
$D^{(\alpha)}$ may be conveniently defined by summing Eq. (11) over i and j to give*

$$D^{(\alpha)} = \langle \Phi^{(\alpha)}, (\mathcal{H} - E)\,\Phi^{(\alpha)} \rangle = g^{-1} \sum_R \Gamma^{(\alpha)}(R)\,D_R, \tag{13}$$

where

$$D_R = H_R - ES_R. \tag{14}$$

* The notation in Eq. (13) requires that the adjoint be taken of a matrix in the left half
of a bracket expression. This definition of $D^{(\alpha)}$ is $l_\alpha$ times that occurring in Fig. 1.

Fig. 1. Block form of D = H − ES in the symmetry basis $\Phi$ for a group of order 15
with 4 irreducible representations with dimensions 1, 1, 2, and 3. Nonzero matrix elements
are shaded.

The matrix $D^{(\alpha)}$ of order $l_\alpha \times l_\alpha$ is of rank $m_\alpha$. This follows from the fact that
the $l_\alpha^2$ functions $\Phi^{(\alpha)}_{ik}$ span a subspace of dimension $m_\alpha l_\alpha$, belonging to $\Gamma^{(\alpha)}$, of
the n-dimensional space of the basis $\phi$. Since functions belonging to different
rows of $\Phi^{(\alpha)}$ are orthogonal, only $m_\alpha$ functions in any row are linearly
independent. Nevertheless, the $m_\alpha$ roots $E_i^{(\alpha)}$ could be obtained from the
determinantal equation

$$\det(D^{(\alpha)}) = \det(H^{(\alpha)} - ES^{(\alpha)}) = 0. \tag{15}$$


This has two disadvantages, however. In the first place, if $m_\alpha < l_\alpha$ Eq. (15) will
have $l_\alpha - m_\alpha$ irrelevant zero roots. Secondly, to construct $D^{(\alpha)}$ it is necessary
to have particular realizations of the $\Gamma^{(\alpha)}(R)$. These disadvantages are removed
in the next section.


IV. Invariant Factor Equations 


Possible forms for the irreducible factor equations (9) are obtained by
equating to zero any nonvanishing minor of $D^{(\alpha)}$ of order $m_\alpha$. However, such
forms contain the elements of the irreducible matrixes $\Gamma^{(\alpha)}(R)$ explicitly. A
form involving only the group characters can be obtained by equating to zero
the sum of all the principal minors of $D^{(\alpha)}$ of order $m_\alpha$. Since $D^{(\alpha)}$ is hermitian,
at least one of the principal minors must be nonvanishing ($m_\alpha \ne 0$). Furthermore,
the nonvanishing minors must be proportional to each other, since
they all yield the same roots.

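For convenience in deriving the explicitly invariant form, the representation
number $\alpha$ will be dropped everywhere temporarily. The matrix $D^{(\alpha)}$ given
by Eq. (13) will therefore be written, ignoring the factor $g^{-1}$,

$$D = \sum_R \Gamma(R)\,D_R. \tag{16}$$

A typical principal minor of D of order m is

$$\begin{vmatrix} \sum_R D_R\Gamma_{ii}(R) & \sum_R D_R\Gamma_{ij}(R) & \cdots & \sum_R D_R\Gamma_{ik}(R) \\ \vdots & & & \vdots \\ \sum_R D_R\Gamma_{ki}(R) & \cdots & & \sum_R D_R\Gamma_{kk}(R) \end{vmatrix}. \tag{17}$$

By the rule for addition of determinants this may be written

$$\sum_R \sum_Q \cdots \sum_K D_R D_Q \cdots D_K \begin{vmatrix} \Gamma_{ii}(R) & \Gamma_{ij}(R) & \cdots & \Gamma_{ik}(R) \\ \Gamma_{ji}(Q) & \Gamma_{jj}(Q) & \cdots & \\ \vdots & & & \vdots \\ \Gamma_{ki}(K) & \cdots & & \Gamma_{kk}(K) \end{vmatrix}. \tag{18}$$

The sum of the principal minors, taken over $i > j > \cdots > k$, is

$$\sum_R \sum_Q \cdots \sum_K D_R D_Q \cdots D_K\,X_m(R, Q, \ldots, K), \tag{19}$$

where

$$X_m(R, Q, \ldots, K) = (m!)^{-1} \sum_{i=1}^{l} \sum_{j=1}^{l} \cdots \sum_{k=1}^{l} \begin{vmatrix} \Gamma_{ii}(R) & \Gamma_{ij}(R) & \cdots & \Gamma_{ik}(R) \\ \Gamma_{ji}(Q) & \Gamma_{jj}(Q) & \cdots & \\ \vdots & & & \vdots \\ \Gamma_{ki}(K) & \cdots & & \Gamma_{kk}(K) \end{vmatrix}. \tag{20}$$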

The importance of the coefficients X defined above lies in the fact that they
can be expressed directly in terms of the characters $\chi(R)$, $\chi(RQ)$, etc., of the
$\alpha$th irreducible representation. That this is possible can be seen immediately
by comparing a typical term from the determinant of Eq. (20) with the formulas
for the characters:

$$\chi(R) = \sum_i \Gamma_{ii}(R),$$

$$\chi(RQ) = \sum_i \sum_j \Gamma_{ij}(R)\,\Gamma_{ji}(Q),$$

$$\chi(RQK) = \sum_i \sum_j \sum_k \Gamma_{ij}(R)\,\Gamma_{jk}(Q)\,\Gamma_{ki}(K), \ \text{etc.}$$

Every term in the compound character $X_m$, as it may be called, corresponds to
a permutation belonging to the symmetric group of degree m. Therefore

$$X_m(R_1, R_2, \ldots, R_m) = (m!)^{-1} \sum_P \pm\,T_P(R_1, R_2, \ldots, R_m), \tag{21}$$

where the summation is over all m! permutations, and the + or − sign is
taken according to whether P is even or odd. The correspondence between
the permutations P and the individual terms of Eq. (21) is illustrated by the
following example: if m = 5 and P = (1)(3)(254), then

$$T_P(R_1, \ldots, R_5) = \chi(R_1)\,\chi(R_3)\,\chi(R_2 R_5 R_4);$$

the form of $T_P$ in the general case is clear from this example. The first three
compound characters are

$$X(R) = \chi(R),$$

$$X(R, Q) = \tfrac{1}{2}[\chi(R)\chi(Q) - \chi(RQ)], \tag{22}$$

$$X(R, Q, K) = \tfrac{1}{6}[\chi(R)\chi(Q)\chi(K) - \chi(R)\chi(QK) - \chi(Q)\chi(KR) - \chi(K)\chi(RQ) + \chi(RQK) + \chi(RKQ)].$$

The $X_m$ are not defined for $m > l$, the dimension of the representation. Some
elementary properties are as follows:

(a) $X_m(R, Q, \ldots, K)$ is symmetric in $R, Q, \ldots, K$.

(b) $\sum_R X_m(R, Q, \ldots, K) = 0$, except for the unit representation.

(c) $X_l(R, R, \ldots, R) = \det\{\Gamma(R)\} = \pm 1$, since $\Gamma$ is unitary.
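
The second compound character can be checked numerically for a small group. The Python sketch below is an illustration under stated assumptions: it uses an explicit real matrix realization of the two-dimensional representation of the group of order 6 (three rotations and three reflections of an equilateral triangle), which is not taken from the text, and compares the determinant sum of Eq. (20) for m = 2 with the character expression $\tfrac{1}{2}[\chi(R)\chi(Q) - \chi(RQ)]$. It also tests properties (b) and (c).

```python
import itertools

import numpy as np


def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])


def reflection(theta):
    """Reflection in a line making the angle theta with the x axis."""
    return np.array([[np.cos(2 * theta), np.sin(2 * theta)],
                     [np.sin(2 * theta), -np.cos(2 * theta)]])


# Assumed realization of the two-dimensional irreducible representation.
group = [rotation(2 * np.pi * k / 3) for k in range(3)] + \
        [reflection(np.pi * k / 3) for k in range(3)]


def x2_from_determinants(r, q):
    """(1/2!) times the sum over i, j of the 2 x 2 determinants of Eq. (20)."""
    dim = r.shape[0]
    total = 0.0
    for i, j in itertools.product(range(dim), repeat=2):
        total += np.linalg.det(np.array([[r[i, i], r[i, j]],
                                         [q[j, i], q[j, j]]]))
    return total / 2.0


def x2_from_characters(r, q):
    return 0.5 * (np.trace(r) * np.trace(q) - np.trace(r @ q))


max_difference = max(
    abs(x2_from_determinants(r, q) - x2_from_characters(r, q))
    for r, q in itertools.product(group, repeat=2)
)
print(max_difference)   # zero to rounding error
```

Only the traces enter `x2_from_characters`, so any equivalent realization of the representation would give the same numbers, which is the point of the invariant formulation.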

The explicitly invariant form of the irreducible factor equations can be
obtained by equating (19) to zero. By substituting for $D_R$ from Eq. (14),
introducing $R_1, R_2, \ldots, R_m$ as the element summation symbols and restoring
the representation number $\alpha$, Eq. (19) can be written in the polynomial form
(with $m = m_\alpha$):

$$\sum_{r=0}^{m} (-E)^r \binom{m}{r} \sum_{R_1} \sum_{R_2} \cdots \sum_{R_m} H_{R_1} \cdots H_{R_{m-r}}\,S_{R_{m-r+1}} \cdots S_{R_m}\,X_m^{(\alpha)}(R_1, R_2, \ldots, R_m) = 0. \tag{23}$$

For the cases $m_\alpha = 1$, 2, and 3, Eq. (23) for the $E_i^{(\alpha)}$ has the form

$$\sum_R (H_R - ES_R)\,X^{(\alpha)}(R) = 0,$$

$$\sum_R \sum_Q (H_R H_Q - 2E H_R S_Q + E^2 S_R S_Q)\,X^{(\alpha)}(R, Q) = 0, \tag{24}$$

$$\sum_R \sum_Q \sum_K (H_R H_Q H_K - 3E H_R H_Q S_K + 3E^2 H_R S_Q S_K - E^3 S_R S_Q S_K)\,X^{(\alpha)}(R, Q, K) = 0,$$

where $X^{(\alpha)}(R, Q)$ and $X^{(\alpha)}(R, Q, K)$ are given by Eq. (22).
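
The first of Eqs. (24) can be tried out on a simple model. The Python sketch below is an illustration under assumptions not made in the text: a Hückel-type ring of n equivalent orbitals with symmetry group $C_n$, generated from one orbital, with $H_R = \alpha$ for the identity, $\beta$ for the two rotations that carry the orbital onto a neighbour, and $S_R = 1$ for the identity only. Every irreducible representation of $C_n$ is one-dimensional and occurs once, so each factor equation is linear and gives $E = \sum_R H_R \chi(R) / \sum_R S_R \chi(R)$. The function name `ring_factor_energies` is hypothetical.

```python
import numpy as np


def ring_factor_energies(n_atoms, alpha, beta):
    """Roots of sum_R (H_R - E S_R) chi(R) = 0 for each irreducible representation
    of the cyclic group C_n acting on a ring of n_atoms equivalent orbitals."""
    g = n_atoms
    h_r = np.zeros(g)
    h_r[0] = alpha          # identity
    h_r[[1, -1]] = beta     # rotations onto the two neighbouring atoms
    s_r = np.zeros(g)
    s_r[0] = 1.0            # overlap neglected between different atoms
    energies = []
    for k in range(g):
        chi = np.exp(2j * np.pi * k * np.arange(g) / g)   # characters of C_n
        energies.append(((h_r @ chi) / (s_r @ chi)).real)
    return np.sort(energies)


n_atoms, alpha, beta = 6, 0.0, -1.0
from_characters = ring_factor_energies(n_atoms, alpha, beta)

# Direct diagonalization of the full n x n secular problem for comparison.
full = alpha * np.eye(n_atoms)
for r in range(n_atoms):
    full[r, (r + 1) % n_atoms] = beta
    full[(r + 1) % n_atoms, r] = beta
from_diagonalization = np.linalg.eigvalsh(full)
print(from_characters, from_diagonalization)
```

The two sets of roots agree, although the first was obtained from the characters alone without constructing or diagonalizing the full matrix.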


V. General Basis 

Consider now the general case in which the basis $\phi$ possesses p generators,
say, $\phi_1, \phi_2, \ldots, \phi_p$. That is, the gp functions $\mathcal{R}\phi_1, \mathcal{R}\phi_2, \ldots, \mathcal{R}\phi_p$ where $\mathcal{R}$
ranges over the group G, span the n-dimensional space of the basis. The
functions produced from the generators may be linearly dependent ($n \le gp$).
Let the subbasis ${}^{\mu}\phi$, consisting of the g functions $\mathcal{R}\phi_\mu$ ($R \in G$), be of rank
${}^{\mu}m$, so that

$$n = \sum_{\mu=1}^{p} {}^{\mu}m.$$

The ${}^{\mu}\phi$ form the basis for a representation ${}^{\mu}\Gamma$ of the group of order ${}^{\mu}m$, in
general reducible. Let

$${}^{\mu}\Gamma = \sum_{\alpha=1}^{h} {}^{\mu}m_\alpha\,\Gamma^{(\alpha)}. \tag{25}$$

Then it follows from Eq. (5) that

$$m_\alpha = \sum_{\mu=1}^{p} {}^{\mu}m_\alpha \quad \text{and} \quad {}^{\mu}m = \sum_{\alpha=1}^{h} {}^{\mu}m_\alpha\,l_\alpha. \tag{26}$$

To factorize the secular equation (1) in the general case it is necessary to
introduce a set of g symmetry functions ${}^{\mu}\Phi = {}^{\mu}\Phi^{(1)}, {}^{\mu}\Phi^{(2)}, \ldots, {}^{\mu}\Phi^{(h)}$ for each
subbasis ${}^{\mu}\phi$, defined by

$${}^{\mu}\Phi^{(\alpha)} = g^{-1} \sum_R \Gamma^{(\alpha)}(R)\,\mathcal{R}\phi_\mu. \tag{27}$$

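Let the new basis of gp functions be ordered lexicographically by $\alpha$, $\mu$, i, k.
The matrix representative of the operator $\mathcal{H} - E$ will consist of $l_\alpha$ diagonal
blocks $D^{(\alpha)}$ for each representation $\Gamma^{(\alpha)}$, as in the simple case of Section III.
However, the $D^{(\alpha)}$ are now of order $p l_\alpha \times p l_\alpha$, and consist of submatrixes

$$D^{(\alpha)} = \begin{pmatrix} {}^{11}D^{(\alpha)} & \cdots & {}^{1p}D^{(\alpha)} \\ \vdots & & \vdots \\ {}^{p1}D^{(\alpha)} & \cdots & {}^{pp}D^{(\alpha)} \end{pmatrix}. \tag{28}$$

The submatrices may be defined, by analogy with Eq. (13), by

$${}^{\mu\nu}D^{(\alpha)} = \langle {}^{\mu}\Phi^{(\alpha)}, \mathcal{D}\,{}^{\nu}\Phi^{(\alpha)} \rangle = g^{-1} \sum_R \Gamma^{(\alpha)}(R)\,{}^{\mu\nu}D_R, \tag{29}$$

where

$${}^{\mu\nu}D_R = \langle \phi_\mu, \mathcal{D}\mathcal{R}\phi_\nu \rangle = {}^{\mu\nu}H_R - E\,{}^{\mu\nu}S_R. \tag{30}$$

By introducing new $p \times p$ matrixes $D_R$, whose elements are the ${}^{\mu\nu}D_R$, the
matrix $D^{(\alpha)}$ may be defined succinctly by

$$D^{(\alpha)} = \sum_R \Gamma^{(\alpha)}(R) \times D_R, \tag{31}$$

where $\times$ indicates a direct product.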

The required form of the irreducible factor equations is obtained by taking
the sum of certain of the principal minors of $D^{(\alpha)}$ of order $m_\alpha$. These minors
must contain ${}^{\mu}m_\alpha$ rows and columns from the submatrix ${}^{\mu\mu}D^{(\alpha)}$ of rank ${}^{\mu}m_\alpha$,
as any nonvanishing minor of $D^{(\alpha)}$ of order $m_\alpha$ must consist of linearly
independent rows and columns. Let $M({}^{1}m, {}^{2}m, \ldots, {}^{p}m)$ be a principal minor
of $D^{(\alpha)}$ of order $m = {}^{1}m + {}^{2}m + \cdots + {}^{p}m$ which satisfies the above
condition. By the rule for the addition of determinants, it can be written

$$M({}^{1}m, {}^{2}m, \ldots, {}^{p}m) = \sum_R \sum_Q \cdots \sum_K \begin{vmatrix} {}^{11}D_R\Gamma_{ii}(R) & {}^{11}D_R\Gamma_{ij}(R) & \cdots & {}^{1p}D_R\Gamma_{ik}(R) \\ {}^{11}D_Q\Gamma_{ji}(Q) & {}^{11}D_Q\Gamma_{jj}(Q) & \cdots & \\ \vdots & & & \vdots \\ {}^{p1}D_K\Gamma_{ki}(K) & \cdots & & {}^{pp}D_K\Gamma_{kk}(K) \end{vmatrix}, \tag{32}$$

where the representation superscript $\alpha$ has been dropped from $\Gamma$. The
determinant in Eq. (32) differs from that in the corresponding Eq. (18) of the
previous section, in that it is not possible to factor the ${}^{\mu\nu}D_R$’s out of it. An
explicitly invariant form of the irreducible factor equation is still obtained by
taking the sum over all such minors, but in this general case must be left in
the form*

$$\sum_{i=1}^{l} \sum_{j=1}^{l} \cdots \sum_{k=1}^{l} M({}^{1}m, {}^{2}m, \ldots, {}^{p}m) = 0. \tag{33}$$


It can be seen, however, that the coefficient of any product of the ${}^{\mu\nu}D_R$’s
will be directly expressible in terms of the simple characters $\chi^{(\alpha)}$; the compound
characters $X_m^{(\alpha)}$ do not appear in general. The invariant form of the factor
equations can be illustrated best by means of simple examples in which the
basis $\phi$ possesses p = 2 generators, $\phi_1$ and $\phi_2$.

(a) Consider the case ${}^{1}m = {}^{2}m = 1$. The general principal minor of $D^{(\alpha)}$ of
order 2 is

$$M(1, 1) = \sum_R \sum_Q \begin{vmatrix} {}^{11}D_R\Gamma_{ii}(R) & {}^{12}D_R\Gamma_{ij}(R) \\ {}^{21}D_Q\Gamma_{ji}(Q) & {}^{22}D_Q\Gamma_{jj}(Q) \end{vmatrix}.$$

Taking the sum of the principal minors, Eq. (33) is

$$\sum_R \sum_Q [\,{}^{11}D_R\,{}^{22}D_Q\,\chi(R)\chi(Q) - {}^{12}D_R\,{}^{21}D_Q\,\chi(RQ)\,] = 0. \tag{34}$$

* Equation (33) is actually $({}^{1}m!\ {}^{2}m! \cdots {}^{p}m!)$ times the sum of the appropriate principal
minors of $D^{(\alpha)}$.

(b) Consider the case ${}^{1}m = 1$, ${}^{2}m = 2$. The principal minor is

$$M(1, 2) = \sum_R \sum_Q \sum_K \begin{vmatrix} {}^{11}D_R\Gamma_{ii}(R) & {}^{12}D_R\Gamma_{ij}(R) & {}^{12}D_R\Gamma_{ik}(R) \\ {}^{21}D_Q\Gamma_{ji}(Q) & {}^{22}D_Q\Gamma_{jj}(Q) & {}^{22}D_Q\Gamma_{jk}(Q) \\ {}^{21}D_K\Gamma_{ki}(K) & {}^{22}D_K\Gamma_{kj}(K) & {}^{22}D_K\Gamma_{kk}(K) \end{vmatrix}.$$

Expanding the determinant and taking the sum over all i, j, k this becomes

$$\sum_R \sum_Q \sum_K {}^{22}D_K \{\,{}^{11}D_R\,{}^{22}D_Q\,[\chi(Q)\chi(K) - \chi(QK)]\,\chi(R) + 2\,{}^{12}D_R\,{}^{21}D_Q\,[\chi(RQK) - \chi(RQ)\chi(K)]\,\} = 0. \tag{35}$$

These equations may be put in the form of polynomials in E by substituting
for ${}^{\mu\nu}D_R$ from Eq. (30).


VI. Epilogue 

It would seem incredible if the mathematical problem solved in this paper 
had not been tackled and solved at least fifty years ago by the mathematicians 
of group representation theory. However, a reasonably diligent search of the 
literature, and much questioning of mathematicians, has not yet brought such 
a discussion to light. Rather than engage in further historical research, it 
seemed more sensible to publish the author’s treatment of the problem within 
the context of quantum mechanics. 


I. Introduction

The techniques for performing ab initio quantum mechanical calculations 
have now developed to the stage where accurate predictions can be made of 
the interaction between simple atoms. Except for the Born-Oppenheimer 
condition, no assumptions or approximations are made in these calculations. 
The accuracy of the results is limited only by the relative completeness of 
the basis set. Further, meaningful bounds can often be put on the results. 
Professor Slater, to whom this volume is dedicated, has played a leading 
role in the development of the technique of ab initio quantum mechanical 
prediction. 

The accurate prediction of an atom-atom interaction is often of considerable
interest to the experimentalist. His experiment does not directly measure
the interaction, yet the interpretation of his results often requires a knowledge 
of the interaction. Without this knowledge, he may be unable to judge the 
validity of his interpretation of the results. Further, his experiments often 
cannot decide among a number of different interactions, let alone determine 
the interactions with accuracy. The ab initioist can assist the experimentalist 
in distinguishing between possible interactions and, less frequently, delineate 
the interactions in detail. 

* Supported by the Robert A. Welch Foundation, Houston, Texas, and by the National
Aeronautics and Space Administration.

In this paper, we review the history of the $^3\Sigma_u^+$ interaction* between
He(1s2s, $^3S$) and He(1s$^2$, $^1S$). This interaction provides an ideal case history
of the interplay between the experimentalist and the ab initioist. A wide
variety of experiments have been used for the experimental study of this
interaction: spectroscopy, scattering, gaseous electronics and isotope
exchange. The important issue in our discussion is the existence of a maximum
in the interaction. The existence of a maximum in this system is not intuitively
obvious, but experiment and theory have combined to establish its existence
and size beyond reasonable doubt. Included in this discussion are the latest,
and heretofore unpublished, calculations performed on this system by the
Molecular Physics Group. In our discussion we will have need to refer to the
following states of He$_2$ and He$_2^+$:

$^2\Sigma_u^+ \equiv$ He$_2^+$ [He$^+$(1s) $^2S$, He(1s$^2$) $^1S$; $^2\Sigma_u^+$]

$^1\Sigma_g^+ \equiv$ He$_2$ [He(1s$^2$) $^1S$, He(1s$^2$) $^1S$; $^1\Sigma_g^+$]

$^1\Sigma_u^+ \equiv$ He$_2$ [He(1s2s) $^1S$, He(1s$^2$) $^1S$; $^1\Sigma_u^+$]

$^3\Sigma_u^+ \equiv$ He$_2$ [He(1s2s) $^3S$, He(1s$^2$) $^1S$; $^3\Sigma_u^+$]

To place the problem in proper perspective, the energies of these states are
plotted as a function of internuclear distance in Fig. 1. These interactions have
been computed at various times by the Molecular Physics Group of the
University of Texas.


II. The Evidence 

The earliest suggestion of a maximum in a He$_2$ state was made in 1935
by Nickerson. He attributed the diffuse 600 Å band in a helium discharge to
the $^1\Sigma_u^+ \to {}^1\Sigma_g^+$ transition of He$_2$. Since the highest frequencies in the band
were greater than the differences in the separated atom energies, it was inferred
that there existed a maximum in the $^1\Sigma_u^+$ interaction.

In 1952, Buckingham and Dalgarno made ab initio quantum mechanical
calculations on the interactions of He(1s$^2$, $^1S$) with He(1s2s, $^1S$) and with
He(1s2s, $^3S$). These authors used a single term Heitler-London wave function
with unoptimized orbital exponents. The orbitals employed were the
hydrogenic 1s and 2s functions of Morse et al. (1935). In the $^1\Sigma_u^+$ and $^3\Sigma_u^+$ states
they found maxima at about 4 $a_0$ with heights of 0.26 and 0.29 eV respectively.
Both of these states were also found to have binding minima at about 2.1 $a_0$.
This classic calculation supported Nickerson’s conjecture about the $^1\Sigma_u^+$
state of He$_2$ and, in addition, suggested the existence of a maximum in the
$^3\Sigma_u^+$ state of He$_2$. This latter system is of particular interest experimentally
because one of the separated atom states, He(1s2s; $^3S$), is metastable with
natural lifetime estimated to be $10^5$ seconds (Mathis, 1957). Such a long
lifetime permits a direct study of the metastable atom and its reactions.

* The state $^3\Sigma_g^+ \equiv$ He$_2$ [He(1s2s) $^3S$, He(1s$^2$) $^1S$; $^3\Sigma_g^+$], which also results from the
interaction of He(1s2s; $^3S$) with He(1s$^2$, $^1S$), is not considered here. However, see Mulliken (1965),
Browne (1965) and Buckingham and Dalgarno (1952).

Fig. 1. Ab initio computed potential energy curves for various states of He$_2$ and He$_2^+$.
Computations done by Molecular Physics Group, University of Texas. He$_2^+$, $^2\Sigma_u^+$ (Reagan,
Browne, and Matsen, 1963); He$_2$, $^1\Sigma_u^+$ (Scott, Greenawalt, Browne, and Matsen, 1966);
He$_2$, $^3\Sigma_u^+$ (Rodriguez, Browne, and Matsen, 1965).

The first experimental evidence of a maximum in the $^3\Sigma_u^+$ interaction came 
from a study of Molnar and Phelps (1953), who used optical techniques in an 
electrical discharge and showed that the destruction of the $^3S$ helium atoms 
at low metastable densities was by diffusion to the walls at low helium pressure 
and by three-body molecular formation 

He($^1S$) + He($^1S$) + He*($^3S$) $\to$ He($^1S$) + He$_2$($^3\Sigma_u^+$) 

at higher helium pressures. Their studies of the diffusion coefficient and 
molecular formation at 300° and 77°K suggested that the interaction had a 
maximum with a height of the order of 0.03 eV. The authors also suggested 
that a maximum existed in the interaction between a normal Ne atom and a 
$^3P$ metastable Ne atom with a height greater than 0.05 eV. This seems to provide 
a lower limit for the $^3\Sigma_u^+$ maximum since the observation of the three-body 
collision coefficient of neon (Phelps, 1959) indicates that the height of the 
maximum for Ne is less than that for helium. 

Burhop (1954) obtained a maximum of 0.12 eV at 4.2 $a_0$ in the $^3\Sigma_u^+$ 
interaction by applying scattering theory to the data of Phelps and Molnar. 

A later investigation (Phelps, 1955) of the diffusion coefficients for 
He($1s2s$, $^1S$) and He($1s2s$, $^3S$) in normal helium by optical methods gave 
additional experimental support for the potential maximum in the $^3\Sigma_u^+$ 
curve of He$_2$. These results showed that the diffusion coefficients for the 
He($1s2s$, $^1S$) and He($1s2s$, $^3S$) atoms were almost equal at 300°K with an 
indication that the former diffusion coefficient was slightly larger. 

At this point in the case history, an ab initio calculation had predicted the 
existence of a maximum in the $^3\Sigma_u^+$ interaction, and several lines of 
experimental evidence tended to substantiate the prediction. However, the 
interpretation of the results was not unambiguous and the experiments were not definitive. 
Further, the possibility existed that the ab initio calculation made an incorrect 
prediction.* For example, the calculations could have contained numerical 
errors, or the predictions could be invalid because of the choice and the size 
of the basis set. (The authors' choice of a hydrogenic 2s function gave rise 
to the possibility that the maximum was an artifact due to the presence of 
the node in that function.) Further, the Buckingham and Dalgarno results lay 
2.25 eV above the sum of the experimental separated atom energies. With 
such an error, the probability is high that the fine features of the interaction 
would not be reproduced with accuracy. 

This problem is relevant to a broad spectrum of experiments; it is an 
important interaction. Further, as a four-electron two-center problem it was 
susceptible to the same type of calculational techniques used in previous 
investigations of lithium hydride.† For these reasons, the Molecular Physics 
Group initiated a series of ab initio valence bond calculations with the aim of 
providing the most accurate predictions possible for this interaction. 

The first of these (Brigman et al., 1961) was a single-configuration variation 
calculation with a basis set of Slater orbitals and was carried out on an 
IBM 650. The important features of this calculation are the optimization of 
orbital exponents at each internuclear separation, the use of an open-shell 
He($1s1s'$, $^1S$) function, and a nodeless 2s function in He($1s2s$, $^3S$). The 
calculated separated atoms energy lay 1.358 eV below that of Buckingham and 
Dalgarno. The calculated interaction showed a maximum of 0.19 eV at 4.5 $a_0$. 
Thus, a second ab initio calculation exhibited a maximum. 

* In fact, the calculations of Buckingham and Dalgarno predicted that the He$_2$ states 
$^1\Sigma_g^+$ and $^3\Sigma_g^+$ were not bound. Both of these states have been found to possess binding 
minima in later calculations (Browne, 1965). 

† For the latest paper in this series, see Browne and Matsen (1964). 

However, the height of the maximum decreased from 0.29 eV in the 
Buckingham and Dalgarno calculation to 0.19 eV in the latter study. This suggested 
the possibility that more refined calculations would reduce or even eradicate 
the maximum. To test this possibility, a new calculation was made on a larger 
computer (CDC 1604) with a larger basis set (Matsen and Poshusta, 1963). 
This calculation employed a twelve-term valence bond wave function 
constructed from 1s, 2s, 3s, 2p, and 3d Slater orbitals. The orbital exponents were 
determined from partial optimization and from helium atom calculations. 
A calculated maximum of 0.14 eV was found at 4.7 $a_0$. Thus, a third ab 
initio calculation exhibited a maximum. However, the computed height of 
the maximum again decreased from the earlier calculated values. 

In the meantime, experimental evidence for a maximum in the closely 
related $^1\Sigma_u^+$ state of He$_2$ appeared in a spectroscopic reinvestigation of the 600 
Å band of helium (Tanaka and Yoshino, 1963). The authors concluded that 
the 600 Å spectrum arises from transitions $^1\Sigma_u^+$ to $^1\Sigma_g^+$ and that the $^1\Sigma_u^+$ 
state possesses a maximum. More recent evidence for the maximum in the 
curve of the $^3\Sigma_u^+$ state of He$_2$ has been obtained from a study of the scattering 
of metastable helium atoms by normal helium atoms (Muschlitz and Richards, 
1964). It was found that the scattering of the triplet atoms was larger than that 
of the singlet atoms. This was attributed to the presence of a larger maximum 
in the $^3\Sigma_u^+$ interaction. 

The most direct evidence for the potential maximum in the $^3\Sigma_u^+$ curve of 
He$_2$ should come from a study of the temperature dependence of collision 
processes. In the case of interaction of two atoms with no potential maximum 
in their interaction curve, it is expected that the cross sections will increase 
with decreasing temperature with a maximum cross section near absolute zero. 
However, for an interaction curve having a maximum, the cross section should 
decrease rapidly at low temperatures, approaching zero as the temperature 
decreases to absolute zero. A study of the temperature variation of the 
metastability exchange cross section of He* with He (Colegrove et al., 1964) was 
undertaken expressly to test experimentally for the existence of a maximum in the 
$^3\Sigma_u^+$ interaction. The experimental results, obtained by optical pumping 
techniques, definitely show that the cross section decreases with decreasing 
temperature, approaching zero as the temperature approaches absolute zero. 
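
The qualitative argument can be illustrated with a purely classical estimate. In a line-of-centers picture (an assumption made here for illustration, not the analysis used in the experiments cited), only collisions whose relative energy exceeds a barrier of height V reach short range, and the thermally averaged fraction of such collisions scales as exp(-V/kT). The barrier heights used below are values quoted in the text; the Boltzmann constant in eV/K and the function name are supplied here.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # assumed value of k in eV/K


def barrier_crossing_factor(barrier_ev, temperature_k):
    """Classical Boltzmann factor exp(-V/kT) for surmounting a barrier of height V."""
    return math.exp(-barrier_ev / (BOLTZMANN_EV_PER_K * temperature_k))


# Compare an experimental estimate (0.03 eV) and a computed height (0.16 eV)
# at room temperature, liquid-nitrogen and liquid-helium temperatures.
for temperature in (300.0, 77.0, 4.2):
    factors = [barrier_crossing_factor(v, temperature) for v in (0.03, 0.16)]
    print(temperature, factors)
```

With no barrier the factor is unity at every temperature, whereas any finite barrier drives it toward zero as the temperature falls, which is the signature sought in the measurements.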

Shortly before the publication of these important experimental papers, the 
Molecular Physics Group had begun another ab initio calculation on the 
$^3\Sigma_u^+$ state of He$_2$ using a mixed basis set of elliptical and Slater orbitals of 
the type developed by Browne and Matsen (1964). The elliptical orbitals were 
introduced in order to better represent polarization effects in the region of 
the maximum. The details of this computation are given in the appendix. The 
computed potential energy curve from this study, as well as those of previous 
investigations, is illustrated in Figs. 2 and 3. A maximum in the $^3\Sigma_u^+$ curve 
of 0.16 eV was found at 4.7 $a_0$. The difference between the calculated separated 
atom energies and the experimental value was only 0.17 eV, and a rigorous 
upper bound of 0.33 eV for the maximum was established. Thus, a fourth ab 
initio calculation exhibited a maximum. Of particular importance was the fact 
that the height of the maximum had stabilized between the third and fourth 
calculation and that refinement of the calculation did not necessarily decrease 
the barrier height. This computation lies at the limit of the CDC 1604 computer. 

Fig. 2. Successively improved ab initio curves for the interaction He($1s^2$, $^1S$) + He($1s2s$, 
$^3S$), plotted against $R$ ($a_0$). Texas, 1961 (Brigman, Brient, and Matsen, 1961); Texas, 1963 (Matsen and 
Poshusta, 1963); Texas, 1965 (Matsen and Scott, 1965). 

A similar ab initio computation on the closely related $^1\Sigma_u^+$ state of He$_2$ 
has also recently been completed (Scott et al., 1966). This is of interest to the 
discussion of the $^3\Sigma_u^+$ curve since the $^1\Sigma_u^+$ state arises from similar atomic 
states, He($1s^2$, $^1S$) and He($1s2s$, $^1S$), and because the calculations of Buckingham 
and Dalgarno indicated a maximum in the $^1\Sigma_u^+$ state as well as in the 
$^3\Sigma_u^+$ state. Clearly, for the results of the ab initio calculations on the $^3\Sigma_u^+$ 
curve to be trustworthy, it should be shown that the maxima which are 
predicted by the early computations on the other states of He$_2$ are verified by 
more refined calculations. Also, there is experimental evidence for the 
maximum in the $^1\Sigma_u^+$ curve from spectroscopic and scattering studies, as mentioned 
previously. The computation, using seventeen terms constructed 
from the Slater orbitals 1s, 2s, 3s, 2p$_0$, 2p$_+$, 2p$_-$, 3p$_0$, 3p$_+$, 3p$_-$, and 3d$_0$, gave a 
maximum of 0.15 eV at 5.2 $a_0$. The height of the maximum is essentially the same as that 
found for the $^3\Sigma_u^+$ curve in the most recent computation, although it is located 
at an internuclear separation 0.5 $a_0$ larger. 

Fig. 3. Ab initio curves for He($1s^2$, $^1S$) + He($1s2s$, $^1S$) (Scott, Greenawalt, Browne, 
and Matsen, 1965) and for He($1s^2$, $^1S$) + He($1s2s$, $^3S$) (Matsen and Scott, 1965). 

The University of Texas is now contracting for a new computer with which 
a fifth ab initio calculation of this type will be carried out. With the new 
computer one can employ a larger basis set and more extensive orbital 
exponent optimization. It is estimated that the result will lie within 0.1 eV of the 
true value. This calculation should provide a representation of the interaction 
which is accurate enough for the experimentalist. 

III. Conclusions 

The results of the ab initio calculations are summarized in Table 1 and Figs. 
2 and 3. There is definite evidence from all of these computations that a 
maximum does exist in the $^3\Sigma_u^+$ curve. In the latest and more extended 
calculations the height of the maximum has leveled off to a value of about 
0.15 eV, which is probably a good estimate. The location of the broad 
maximum is 4.6-4.7 $a_0$. The validity of the calculations is supported by the fact 
that a maximum is also found in the $^1\Sigma_u^+$ curve calculated to the same level 
as the $^3\Sigma_u^+$ curve. The maximum in the $^1\Sigma_u^+$ curve is also supported by 
experiment. 
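
The leveling off can be read directly from the barrier heights quoted in Section II. The following short tabulation is only a convenience sketch: the numbers are those given in the text, while the variable names and the choice of two-decimal rounding are made here.

```python
# (calculation, barrier height in eV, location of maximum in a0), as quoted in the text.
calculations = [
    ("Buckingham and Dalgarno, 1952", 0.29, 4.0),
    ("Brigman et al., 1961", 0.19, 4.5),
    ("Matsen and Poshusta, 1963", 0.14, 4.7),
    ("Matsen and Scott, 1965", 0.16, 4.7),
]

# Change in computed barrier height from each calculation to the next.
height_changes = [
    round(later[1] - earlier[1], 2)
    for earlier, later in zip(calculations, calculations[1:])
]

for (label, height, location), change in zip(calculations[1:], height_changes):
    print(f"{label}: {height:.2f} eV at {location:.1f} a0 (change {change:+.2f} eV)")
```

The successive changes shrink in magnitude and the last one reverses sign, which is the sense in which the height has stabilized.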

This theoretical evidence, coupled with the experimental studies, particularly 
the study of the temperature dependence of the metastability exchange cross 
section, proves beyond any reasonable doubt that a maximum does exist in the 
$^3\Sigma_u^+$ curve of He$_2$. 

A fact of considerable significance is that no elementary physical basis for 
the maximum in the interaction has yet been proposed. This important feature 
of the interaction is the consequence of subtle dynamic effects which are 
revealed only by the intensive ab initio treatment of the type described here. 

We wish to express our appreciation to Dr. J. C. Browne for his many 
helpful suggestions and criticisms. 


Appendix. Potential Curve for Lowest $^3\Sigma_u^+$ State of He$_2$ 

The wave functions used in these computations were of the generalized 
valence bond type (Browne and Matsen, 1962). The basis set was composed 
of both Slater orbitals as defined by Browne and Miller (1962) and elliptic 
orbitals defined as follows: 

$$\phi(n, j, m) = (2\pi)^{-1/2}\,\lambda^{n}\mu^{j}\exp[-(\alpha\lambda + \beta\mu)]\,[(\lambda^2 - 1)(1 - \mu^2)]^{|m|/2}\exp(im\varphi)$$

$\alpha$ and $\beta$ are the variable orbital exponents, and the elliptic coordinates, 
$\lambda$ and $\mu$, are defined by Miller et al. (1959). Harris (1960), Harris and Taylor 
(1963), and Browne (1964) describe the properties of these orbitals. Orbital 
exponents were varied independently at each internuclear separation to 
obtain the lowest energy with the following wave function: 

% = Q[1s^ s W> b ( 1,0,0;2)] + C 2 [ls 2 ls^ B (0,1,0; 2)]. 


TABLE 1 

Summary of Ab Initio Valence Bond Calculations of $^3\Sigma_u^+$ States of He$_2$: 
On the Existence of a Maximum in Interaction 

Matsen and Scott (1965). Unpublished work. See appendix for details of computations. 
After obtaining these orbital exponents, additional terms composed of Slater 
orbitals were added to give the twenty-one term wave function below: 

+ C 3 [2p3p-^ls^2s B ] + C^[2s' a 2 1s' b 2s b \ + C 5 [2p^ls^2s B ] 

+ Cg[ 3s^ 2 ls B 2s B ] + C 7 [ls^ls B 2p 0B ] + C 8 [1 s a 2s a 1s b ] 

+ C 9 [3p+ A 3p- A \s' B 2s B ] + C 10 [1s^2s^1s b 2s b ] + C 1 ^[2s /1 3s^ls B 2s B ] 

+ C 12 [ls A 3s^ls B 2s B ] + C 1 3[2p + ^ 4 3p_ y4 ls B 2s B ] + C 1 4[ls^2p 0A ls B ] 

+ C 15 [ls x 2p 0y4 ls B 2s B ] + C 16 [1 s^1s b 2s b ] + C 17 [3po A ls B 2s B ] 

+ Ciq[2pq A 3pq A \s b 2s b ] + C 19 [1s^1s b 3s b ] + C 2 o[ls^2s^42s B ] 

+ C 21 [2s^ 2 ls B 2po B ] 

The carets indicate the pairing between the orbitals. The orbital exponents for 
the added terms were obtained from a similar computation on the $^1\Sigma_u^+$ 
state (Scott et al., 1965), previous computations on the $^3\Sigma_u^+$ state (Matsen 
and Poshusta, 1963), a computation on the helium atom (Matsen and Stuart, 
1964), and from numerical experimentation with $\Psi_1$ and added terms. The 
orbital exponents are listed in Table 2. The numerical potential energy data 
are collected in Table 3 and illustrated in Figs. 2 and 3. Table 4 contains the 
computed spectroscopic constants. These computations were performed on 
the CDC 1604 at the University of Texas using programs written by the 
Molecular Physics Group. 

TABLE 2 

Orbital Exponents for $\Psi_2$ 

R (a0)    α(0,1,0)   β(0,1,0)   α(1,0,0)   β(1,0,0)   1s        1s' 
 1.5      0.9363     1.2632     0.4026     0.0718     1.8244    2.0015 
 1.8      1.6234     1.2691     0.4611     0.1405     1.7715    2.0174 
 1.9      0.7510     1.3300     0.4625     0.0654     1.7527    2.0135 
 1.982    0.5223     1.2224     0.4847     0.0342     1.7444    2.0182 
 2.08     0.6834     1.1000     0.5015     0.0834     1.7329    2.0148 
 2.13     0.6766     1.1600     0.5106     0.0934     1.7264    2.0136 
 2.17     0.6466     1.1000     0.5147     0.0765     1.7253    2.0172 
 2.20     0.7156     1.100      0.5228     0.1229     1.7206    2.0144 
 2.50     0.7045     1.200      0.5842     0.2776     1.7004    2.0124 
 3.00     0.3152     1.0800     0.6861     0.7147     1.6838    2.0040 
 3.50     0.7061     1.2100     0.7953     0.8889     1.6816    1.9915 
 4.00     2.4050     1.6359     1.0273     1.1352     1.6818    2.0018 
 4.50     2.7079     1.9032     1.1874     1.2679     1.6834    2.0014 
 5.00     2.9766     2.0675     1.3541     1.4373     1.6849    2.0008 
 5.50     3.1736     2.3445     1.5197     1.5143     1.6859    2.0010 
 6.00     3.2089     2.4711     1.6988     1.6240     1.6864    2.0008 
 7.00     3.4439     2.7799     2.0294     1.7931     1.6870    2.0008 
10.00     3.5825     3.4548     2.8917     2.4342     1.6875    1.9994 
15.00     6.1008     5.2296     4.4623     3.9095     1.6647    2.0009 

TABLE 2 (Cont.) 

R (a0)    2s       2s'      3s      3s'     2p0, 2p+, 2p-   2p0      3p0, 3p+, 3p- 
 1.5      0.4500   3.5801   2.175   1.227   2.4700          0.4500   3.000 
 1.8      0.4600   2.8582   2.170   1.227   2.4700          0.4600   3.000 
 1.9      0.4650   2.7500   2.170   1.227   2.4750          0.4650   3.000 
 1.982    0.4650   2.6343   2.170   1.227   2.4800          0.4650   3.000 
 2.08     0.4700   2.7000   2.164   1.227   2.4800          0.4700   3.000 
 2.13     0.4700   2.7675   2.164   1.227   2.4800          0.4700   3.000 
 2.17     0.4750   2.7280   2.164   1.227   2.4800          0.4750   3.000 
 2.20     0.4800   2.6888   2.164   1.227   2.4800          0.4800   3.000 
 2.50     0.4900   2.6780   2.160   1.227   2.4850          0.4900   3.000 
 3.00     0.5000   2.5708   2.160   1.227   2.4900          0.5000   3.000 
 3.50     0.5200   2.5562   2.160   1.227   2.4950          0.5200   3.000 
 4.00     0.5300   2.4954   2.160   1.227   2.5000          0.5300   3.000 
 4.50     0.5600   2.5622   2.159   1.227   2.500           0.5600   3.000 
 5.00     0.5600   2.5327   2.155   1.227   2.500           0.5600   3.000 
 5.50     0.5600   2.5584   2.150   1.227   2.500           0.5600   3.000 
 6.00     0.5650   2.5550   2.150   1.227   2.500           0.5650   3.000 
 7.00     0.5650   2.5500   2.140   1.227   2.500           0.5650   3.000 
10.00     0.5700   2.4479   2.130   1.227   2.5000          0.5700   3.000 
15.00     0.5700   2.4473   2.099   1.227   2.5000          0.5700   3.000 





TABLE 3 

Potential Energy Data for $^3\Sigma_u^+$ State of He$_2$ from $\Psi_2$ 

R (au)    -E (au)      R (au)    -E (au) 
1.50      5.03724       3.50     5.07490 
1.80      5.11073       4.00     5.06787 
1.90      5.11829       4.50     5.06694 
1.982     5.12202       5.00     5.06693 
2.08      5.12373       5.50     5.06807 
2.13      5.12360       6.00     5.06913 
2.17      5.12279       7.00     5.07071 
2.20      5.12097      10.00     5.07234 
2.50      5.11084      15.00     5.07266 
3.00      5.09031 
TABLE 4 

Computed Spectroscopic Constants for $^3\Sigma_u^+$ State of He$_2$ from $\Psi_2$ 

                  R_e (a0)   ω_e (cm^-1)   ω_e x_e (cm^-1)   α_e (cm^-1)   B_e (cm^-1) 
Experimental (a)  1.975      1809.91       38.8$_9$          0.243$_8$     7.7103 
Calculated        2.08 (b)   2051.73 (c)   37.17 (c)         0.095 (c)     6.78 (c) 

(a) Ginter, 1965. 
(b) Obtained by fitting a cubic through computed points at 1.982, 2.08, 2.13, and 2.20 $a_0$. 
(c) Obtained by fitting a cubic by least squares through computed points at 1.5, 1.8, 1.9, 
1.982, 2.08, 2.13, 2.50, and 3.00 $a_0$ and applying the formulas of Dunham (1932). 


The results of this calculation show a maximum in the curve at 4.6 $a_0$ with 
a computed height of 0.16 eV. From the experimental separated atom energies 
and the computed energy at an internuclear separation of 4.5 $a_0$, a rigorous 
upper bound of 0.33 eV may be set on the height of the maximum. The 
minimum in the curve occurs at $R = 2.08\,a_0$, and the calculated value for the 
rationalized binding energy, $D_e$, is 1.39 eV. A new rigorous lower bound on 
$D_e$ is calculated to be 1.22 eV. From an energy cycle involving the ionization 
energy of He($1s2s$; $^3S$) and the ionization energy of the $^3\Sigma_u^+$ state of He$_2$ 
(Reagan et al., 1963), a lower limit of 1.76 eV may be set for $D_e$. Therefore, 
the wave function still needs to be improved in the vicinity of the minimum. 
The difference between the sum of the experimental separated atom energies 
and the calculated value at 15 $a_0$ is only 0.17 eV. The agreement between the 
computed and experimental spectroscopic constants is good. 
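
The quoted barrier height, binding energy, and bounds follow from a few entries of Table 3. The sketch below reproduces that arithmetic; the energies and the 0.17 eV separated-atom error are taken from the text, whereas the hartree-to-eV factor (27.2114) and all names are assumptions introduced here. The curve at 15 $a_0$ is used as the computed separated-atom limit, and the tabulated point at 5.00 $a_0$ as the top of the barrier.

```python
HARTREE_TO_EV = 27.2114  # assumed conversion factor

# -E in atomic units at selected internuclear separations (from Table 3).
NEG_ENERGY_AU = {2.08: 5.12373, 4.50: 5.06694, 5.00: 5.06693, 15.00: 5.07266}

# Computed separated-atom energy minus experimental value, in eV (quoted in the text).
SEPARATED_ATOM_ERROR_EV = 0.17


def energy_difference_ev(r_upper, r_lower):
    """E(r_upper) - E(r_lower) in eV, using the tabulated values of -E."""
    return (NEG_ENERGY_AU[r_lower] - NEG_ENERGY_AU[r_upper]) * HARTREE_TO_EV


# Barrier height and binding energy relative to the computed limit at 15 a0.
barrier_height = energy_difference_ev(5.00, 15.00)
binding_energy = energy_difference_ev(15.00, 2.08)

# A variational energy lies above the true one, so measuring from the
# experimental limit gives an upper bound on the barrier and a lower bound on D_e.
barrier_upper_bound = energy_difference_ev(4.50, 15.00) + SEPARATED_ATOM_ERROR_EV
binding_lower_bound = binding_energy - SEPARATED_ATOM_ERROR_EV

print(round(barrier_height, 2), round(binding_energy, 2))
print(round(barrier_upper_bound, 2), round(binding_lower_bound, 2))
```

Both bounds rest only on the variational principle: the computed molecular energy at any separation cannot lie below the true energy there.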
I. Introduction 


This discussion will relate particularly to the approximate solution of the 
Schrödinger equation for the helium atom and other three-particle Coulombic 
systems. These problems form the simplest examples of incompletely 
separable Schrödinger equations where recursion formula methods have been 
used. 

Slater (1927) was one of the first to consider the quantum mechanics of the 
He atom. The early results of Hylleraas (1929, 1964) served not only to provide 
a good theoretical ground-state energy of He but also to confirm the essential 
validity of quantum mechanics. Kato (1951a,b) proved the existence of 
solutions to the problem, and Kinoshita (1957, 1959) and especially Pekeris (1958, 
1959) produced results of unusual accuracy. 

Recursion formulas in the He problem were first considered by Bartlett 
et al. (1935) and later by Kinoshita (1957), Pekeris (1958) and Munschy and 
Pluvinage (1962). 
The He calculation has been reviewed by Bethe and Salpeter (1957) and by 
Hylleraas (1964). 

The purpose of the present article is to discuss the relation between recursion 
formula methods and the variation method ordinarily used for He and similar 
problems. 


II. Coordinates and Function Sets 

If the discussion is restricted to S states of a general three-particle Coulombic 
system in the nonrelativistic approximation, with neglect of magnetic 
interactions, it is well known that the wave function involves three nonseparable 
internal coordinates. Two sets of coordinates will be emphasized in this 
discussion: interparticle coordinates and perimetric coordinates. For each 
set of coordinates two or more function sets will be considered. 

A. Coordinates 

1. Interparticle Coordinates $r_1$, $r_2$, $r_{12}$. The distances of electrons number 1 
and 2 from the nucleus are $r_1$ and $r_2$, while $r_{12}$ is the interelectronic distance. 
If the nucleus or other particle is called particle number 3, a more symmetrical 
notation may be used: 

$$r_1 = r_{13}, \qquad r_2 = r_{23}.$$

2. Hylleraas Coordinates s, t, u. For systems with two like particles, such as 
two electrons, the symmetry of the wave function is most easily handled with 
coordinates defined as: 

$$s = r_1 + r_2, \qquad t = -r_1 + r_2, \qquad u = r_{12}.$$

3. Kinoshita Coordinates s, p, q. Kinoshita (1957) showed the advantage to 
be gained by defining: 

$$s = r_1 + r_2 \ \text{(as before)}, \qquad p = u/s, \qquad q = t/u.$$

In particular, a series using positive powers of p and q, which bring in negative 
powers of s and u, can result in a formal solution. 

4. Perimetric Coordinates u, v, w. These coordinates were originally defined 
by Coolidge and James (1937) as: 

$$u = r_2 + r_{12} - r_1, \qquad v = r_1 + r_{12} - r_2, \qquad w = r_1 + r_2 - r_{12}.$$

Pekeris (1958) made use of similar coordinates differing by a common scale 
factor and a factor of 2 in w: 

$$w = 2(r_1 + r_2 - r_{12}).$$

The Kinoshita and perimetric coordinates both have an advantage over 
the other systems mentioned above in that the ranges of the separate 
coordinates are independent of each other. In interparticle coordinates, for example, 
the triangular condition 

$$|r_1 - r_2| \le r_{12} \le r_1 + r_2$$

must be satisfied. 
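
The relations among the four coordinate systems can be made concrete with a few conversion routines. The sketch below follows the definitions given above (Coolidge and James form of the perimetric coordinates); the function names are hypothetical and the sample distances are arbitrary.

```python
def hylleraas(r1, r2, r12):
    """Return (s, t, u) from the interparticle distances."""
    return (r1 + r2, -r1 + r2, r12)


def kinoshita(r1, r2, r12):
    """Return (s, p, q); requires s > 0 and u > 0."""
    s, t, u = hylleraas(r1, r2, r12)
    return (s, u / s, t / u)


def perimetric(r1, r2, r12):
    """Return (u, v, w) in the Coolidge and James form."""
    return (r2 + r12 - r1, r1 + r12 - r2, r1 + r2 - r12)


def from_perimetric(u, v, w):
    """Invert perimetric(): recover (r1, r2, r12)."""
    return ((v + w) / 2, (u + w) / 2, (u + v) / 2)


def satisfies_triangle(r1, r2, r12):
    """Triangular condition on the interparticle distances."""
    return abs(r1 - r2) <= r12 <= r1 + r2


# A valid triangle maps to three non-negative perimetric coordinates ...
print(perimetric(1.0, 2.0, 2.5), satisfies_triangle(1.0, 2.0, 2.5))
# ... while an impossible set of distances yields a negative one.
print(perimetric(1.0, 1.0, 3.0), satisfies_triangle(1.0, 1.0, 3.0))
```

Because every non-negative triple (u, v, w) inverts to distances that satisfy the triangular condition, each perimetric coordinate can range from zero to infinity independently of the others.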


B. Function Sets 

This discussion will involve, and attempt to interrelate, power series, 
exponential power series, and the series in Laguerre functions as used by 
Pekeris. Other important function sets have been put forth by Fock (1954) 
and by Pluvinage (1955) and Munschy and Pluvinage (1957). 

1. Power Series. The simple power series in interparticle coordinates 

$$\psi = \sum_{l,m,n} c_{l,m,n}\, r_1^{l}\, r_2^{m}\, r_{12}^{n} \qquad (1)$$

was considered by Bartlett et al. (1935). This series yields the simplest 
recursion formula and is useful in discussing the possibility of a formal solution. 
Of course, it is of no value for a variation method calculation since the separate 
terms do not satisfy boundary conditions. 

2. Exponential Power Series. 

$$\psi = \sum_{l,m,n} c_{l,m,n}\, e^{-Z(r_1 + r_2)}\, r_1^{l}\, r_2^{m}\, r_{12}^{n} \qquad (2)$$

This series and similar series in the other coordinate systems are 
particularly useful in variation method calculations. Recursion formulas are easily 
derived. 

3. Laguerre Function Series in Perimetric Coordinates. 

$$\psi = \sum_{l,m,n} c_{l,m,n} \exp[-(u + v + w)/2]\, L_l(u)\, L_m(v)\, L_n(w) \qquad (3)$$

This is the series used by Pekeris. 
III. Recursion Formulas 

Bartlett et al. (1935) derived the nine-term recursion formula for the simple 
power series in interparticle coordinates, Eq. (1): 

$$
\begin{aligned}
&(l+2)(l+3+n)\,c_{l+2,m,n} + (m+2)(m+3+n)\,c_{l,m+2,n} \\
&\quad + (n+2)(2n+6+l+m)\,c_{l,m,n+2} - (l+2)(n+2)\,c_{l+2,m-2,n+2} \\
&\quad - (m+2)(n+2)\,c_{l-2,m+2,n+2} + (E/4)\,c_{l,m,n} \\
&\quad + c_{l+1,m,n} + c_{l,m+1,n} - \tfrac{1}{2}\,c_{l,m,n+1} = 0 \qquad (4)
\end{aligned}
$$

Other published recursion formulas for the He problem are the twelve-term 
formula for the exponential-power series in Kinoshita (1957) coordinates and 
the 33-term formula of Pekeris (1958) for the series of Eq. (3). 

A. The question of formal solutions 

Bartlett et al. (1935) showed that the recursion formula of Eq. (4) does not 
permit a formal solution. In particular, by inserting for l, m, and n the sets of 
values (−1, 2, −1), (−1, 0, 0), and (1, 0, −1), an inconsistent set of 
equations results for the calculation of $c_{1,0,1}$ in terms of $c_{0,0,0}$. In the same 
way it can be shown that power series or exponential-power series in any of 
the coordinate systems 1, 2, or 4 of Section II,A above do not provide 
formal solutions. 
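
The inconsistency can be exhibited by writing out the nine-term relation for the three index sets. The following sketch makes two assumptions that should be read as such: coefficients with any negative index are taken to vanish (the series contains only non-negative powers), and the electron-repulsion term carries the coefficient −1/2, as in the helium form of the recursion formula written with unit coefficients on the nuclear-attraction terms. The function name and the placeholder energy are hypothetical.

```python
from fractions import Fraction


def recursion_relation(l, m, n, energy=Fraction(-1)):
    """Nine-term relation at (l, m, n) as {coefficient index: numerical factor}.

    Coefficients with a negative index are assumed to vanish and are dropped.
    The energy value is a placeholder; its term drops out for the index sets below.
    """
    terms = {
        (l + 2, m, n): Fraction((l + 2) * (l + 3 + n)),
        (l, m + 2, n): Fraction((m + 2) * (m + 3 + n)),
        (l, m, n + 2): Fraction((n + 2) * (2 * n + 6 + l + m)),
        (l + 2, m - 2, n + 2): Fraction(-(l + 2) * (n + 2)),
        (l - 2, m + 2, n + 2): Fraction(-(m + 2) * (n + 2)),
        (l, m, n): Fraction(energy) / 4,
        (l + 1, m, n): Fraction(1),
        (l, m + 1, n): Fraction(1),
        (l, m, n + 1): Fraction(-1, 2),  # assumed repulsion coefficient
    }
    return {idx: f for idx, f in terms.items() if min(idx) >= 0 and f != 0}


first = recursion_relation(-1, 0, 0)    # relates c(1,0,0) to c(0,0,0)
second = recursion_relation(1, 0, -1)   # relates c(1,0,1) to c(1,0,0)
third = recursion_relation(-1, 2, -1)   # involves c(1,0,1) alone

c000 = Fraction(1)
c100 = -first[(0, 0, 0)] * c000 / first[(1, 0, 0)]
c101_from_second = -second[(1, 0, 0)] * c100 / second[(1, 0, 1)]

print(first, second, third)
print(c100, c101_from_second)
```

Under these assumptions the second relation fixes $c_{1,0,1}$ at a nonzero multiple of $c_{0,0,0}$, while the third relation contains $c_{1,0,1}$ alone and therefore forces it to zero; the two can be reconciled only by the trivial choice $c_{0,0,0} = 0$.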

On the other hand, Kinoshita (1957) proved that the exponential-power 
series in his coordinate system (system 3 of Section II,A) does give the possibility 
of a formal solution. This was accomplished by arranging the coefficients, 
$c_{l,m,n}$, in such an order that each succeeding coefficient could be calculated in 
terms of the preceding coefficients. That this series forms a formal solution 
whereas the corresponding Hylleraas series does not was explained by noting 
that the former differs from the latter by the inclusion of terms with negative 
exponents on Hylleraas coordinates. 

Whether the Pekeris series, Eq. (3), can or cannot be a formal solution 
seems to be an open question. The 33-term recursion formula is sufficiently 
complicated that it is not easy to find an inconsistency by the method of 
Bartlett, Gibbons, and Dunn, nor does it seem likely that the 
recursion formula can be put in a form that proves positively that a formal solution exists. 

Although Kinoshita presented and discussed the recursion formula for his 
series, he did not make use of it in his numerical calculations. However, in the 
Pekeris calculation the recursion formula was a central feature in the 
numerical work in that the formula essentially provided all of the matrix components 
for the variation method secular equation without the direct evaluation of any 
integrals. In this way, Pekeris (1959) obtained the remarkable precision of ten 
significant figures for the He atom energy while using up to 1078 terms in the 
series. This might lead one to believe that the Pekeris series could be a formal 
solution. This is not necessarily the case since the Pekeris method involves 
solving a finite secular equation obtained by truncating the infinite matrix 
resulting from the application of the recursion formula. Even if a formal 
solution exists, the coefficients obtained in the Pekeris calculation will not all 
satisfy the recursion formula. 

B. A PARADOX 

The unusual accuracy of the Pekeris calculation might lead one to surmise 
that his series can be a formal solution. If it be assumed that this is the case, 
then the following paradox would seem to exist. The function $L_l(u)$, for 
example, is a polynomial of degree l in the variable u. If the Pekeris series, 
Eq. (3), with terms up through a certain maximum degree l + m + n is 
expanded, it will be in the form of an exponential-power series in perimetric 
coordinates with terms to this same highest degree. This finite series in turn 
is equivalent to an exponential-power series in interparticle coordinates to 
this same degree. Up to this given degree each set of basis functions can be 
written in terms of any of the other sets. Therefore, since both 
exponential-power series are known to lack formal solutions, doubt is cast on the Pekeris 
series as a formal solution. 

The resolution of the paradox, if indeed it actually exists, can be found in 
terms of transformations between basis sets of functions, in particular between 
the set of Laguerre functions of perimetric coordinates and the set of functions 
of the exponential power series in perimetric coordinates. If $\{\varphi_{l,m,n}\}$ and 
$\{\varphi'_{l,m,n}\}$ are the two basis sets of exponential powers and exponential Laguerre 
functions as used in Eqs. (2) and (3), which can be represented by infinite row 
vectors $\Phi$ and $\Phi'$, respectively, an infinite series wave function $\psi$ can be 
written as 

$$\psi = \Phi\mathbf{c} = \Phi'\mathbf{c}' \qquad (5)$$

where $\mathbf{c}$ and $\mathbf{c}'$ are column vectors formed from the coefficients in the two 
series. 

Let $\mathbf{T}$ be a transformation matrix such that 

$$\Phi' = \Phi\mathbf{T} \quad \text{and} \quad \Phi = \Phi'\mathbf{T}^{-1}. \qquad (6)$$

Each column of $\mathbf{T}$ contains a finite set of nonzero numbers, which are the 
coefficients in the corresponding product of Laguerre polynomials. Conversely, the 
columns of $\mathbf{T}^{-1}$ are the coefficients of exponential powers expressed as linear 
combinations of Laguerre polynomials. Both $\mathbf{T}$ and $\mathbf{T}^{-1}$ are approximately 
upper triangular matrixes in that zero elements occur if one goes sufficiently 
far below the main diagonal. 
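
A one-variable analog makes the structure of these matrixes easy to see. The sketch below is an illustration only, not the three-variable transformation of the text: it builds the matrix relating the powers $x^k$ to the simple Laguerre polynomials $L_n(x)$ and its inverse, using the standard expansions $L_n(x) = \sum_k (-1)^k \binom{n}{k} x^k / k!$ and $x^k = k! \sum_n (-1)^n \binom{k}{n} L_n(x)$. The function names and the truncation size are choices made here.

```python
from fractions import Fraction
from math import comb, factorial


def laguerre_from_powers(size):
    """T[k][n] = coefficient of x**k in L_n(x); column n lists the expansion of L_n."""
    return [
        [Fraction((-1) ** k * comb(n, k), factorial(k)) for n in range(size)]
        for k in range(size)
    ]


def powers_from_laguerre(size):
    """Tinv[n][k] = coefficient of L_n(x) in x**k; column k lists the expansion of x**k."""
    return [
        [Fraction((-1) ** n * comb(k, n) * factorial(k)) for k in range(size)]
        for n in range(size)
    ]


def matmul(a, b):
    """Product of two square matrixes stored as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


SIZE = 6
T = laguerre_from_powers(SIZE)
T_INV = powers_from_laguerre(SIZE)
IDENTITY = [[Fraction(int(i == j)) for j in range(SIZE)] for i in range(SIZE)]

print(matmul(T, T_INV) == IDENTITY)
print(T[0])  # first row: one entry per Laguerre polynomial, none of them zero
```

In this analog both matrixes are exactly upper triangular, so every column is finite, but every row runs on without end: the first row of `T` already contains a nonzero entry for each polynomial, however large the truncation.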
In order for Eq. (5) and Eq. (6) to be satisfied, the coefficient vectors must 
transform as 

$$\mathbf{c}' = \mathbf{T}^{-1}\mathbf{c} \quad \text{and} \quad \mathbf{c} = \mathbf{T}\mathbf{c}'. \qquad (7)$$

Because of the upper triangular nature of $\mathbf{T}^{-1}$ and $\mathbf{T}$ and the involvement of 
rows of $\mathbf{T}^{-1}$ and $\mathbf{T}$ in Eq. (7) instead of columns, the elements of $\mathbf{c}$ and $\mathbf{c}'$ are 
related through infinite rather than finite series. It is uncertain in general 
whether such series converge. If the Pekeris-type infinite series is a formal 
solution, there should be a well-defined vector of coefficients $\mathbf{c}'$. But the infinite 
series $\mathbf{c} = \mathbf{T}\mathbf{c}'$ may diverge or oscillate; therefore the exponential power 
series can be understood to lack a formal solution even though the Pekeris 
series may be a formal solution. 

IV. Matrix Formulation 

In considering the general three-particle problem in which no one particle 
is held fixed it has been found useful to use a matrix formulation (Frost, 
1964). Allowing all three particles to move relative to the center of mass the 
Pekeris series recursion formula would expand to 55 terms. Using a simpler 
exponential-power series in perimetric coordinates yields a 25-term recursion 
formula (Frost et al., 1964a). After first setting up the 25-term recursion 
formula matrix, it was then easily transformed to what would have been obtained 
directly from the 55-term recursion formula. The computer programming of 
the calculation was easier to accomplish by this method than if the 55-term 
formula had been used directly. 

A. Recursion formulas in matrix form 
Consider a general series solution of an arbitrary Schrödinger equation in 
the form 

$$(\mathcal{H} - E)\psi = 0, \tag{8}$$

where 

$$\psi = \sum_j \phi_j c_j. \tag{9}$$

By substitution 

$$\sum_j (\mathcal{H}\phi_j - E\phi_j) c_j = 0. \tag{10}$$

$\mathcal{H}\phi_j$ can in principle be expanded as a series in $\phi_i$. This series will in general 
be infinite. However if Eq. (10) is multiplied by a carefully chosen function 
$g$, which for the three-particle problem is 

$$g = r_{12} r_{13} r_{23},$$

the necessary expansions have a finite number of terms: 

$$g\mathcal{H}\phi_j = \sum_i \phi_i H_{ij}, \tag{11}$$

$$g\phi_j = \sum_i \phi_i S_{ij}. \tag{12}$$

The expansion coefficients are elements of matrixes defined as H and S, 
respectively. 

After multiplying by g, Eq. (10) can then be written in matrix form as 

$$\Phi(H - ES)c = 0. \tag{13}$$

The recursion formula in matrix form follows from the requirement that the 
coefficient of each element of $\Phi$ must be zero; therefore, 

$$(H - ES)c = 0. \tag{14}$$

That the recursion formula has a finite number of terms corresponds to the 
matrixes H and S having only a finite number of nonzero elements in each 
column even though the matrixes are infinite. 

Equation (14) amounts to an infinite number of linear equations in an 
infinite number of unknowns $c$. To get an approximate solution of the problem 
it seems necessary to truncate the system of equations to finite size, say 
$n$ by $n$, and solve the usual secular equation 

$$|H - ES|_{n \times n} = 0. \tag{15}$$

In general this does not seem possible since the matrixes $H$ and $S$ are not 
necessarily Hermitian or symmetrical and therefore the eigenvalues $E$ are not 
necessarily real. Pekeris was successful because with his choice of basis 
functions, which are orthogonal with a weighting factor $1/g$, the matrixes are 
necessarily symmetrical. It is this truncation of Eq. (14) with symmetrical 
matrixes that causes the result to be equivalent to the use of the variation 
method. 
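
The point about real eigenvalues can be illustrated with the smallest possible truncation. The following is only a sketch under stated assumptions: the 2-by-2 matrices are hypothetical numbers chosen for illustration (they are not elements of any recursion formula matrix from the text), and the function name `secular_roots_2x2` is invented here. It expands the secular determinant of Eq. (15) into a quadratic in $E$ and solves it.

```python
import cmath

def secular_roots_2x2(H, S):
    """Roots E of det(H - E*S) = 0 for 2x2 matrices given as nested lists."""
    (h11, h12), (h21, h22) = H
    (s11, s12), (s21, s22) = S
    # det(H - E*S) = a*E**2 + b*E + c
    a = s11 * s22 - s12 * s21
    b = -(h11 * s22 + h22 * s11 - h12 * s21 - h21 * s12)
    c = h11 * h22 - h12 * h21
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b - disc) / (2 * a), (-b + disc) / (2 * a))

# Hypothetical illustration: unit overlap matrix, two choices of H.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ROOTS_SYM = secular_roots_2x2([[0.0, 1.0], [1.0, 0.0]], IDENTITY)     # symmetric H
ROOTS_UNSYM = secular_roots_2x2([[0.0, 1.0], [-1.0, 0.0]], IDENTITY)  # unsymmetric H

print("symmetric H:  ", ROOTS_SYM)    # both roots real
print("unsymmetric H:", ROOTS_UNSYM)  # purely imaginary pair
```

With the symmetric choice the roots are real, while the unsymmetric choice gives a complex-conjugate pair, which is the difficulty described above for unsymmetrical truncated matrixes.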

This matrix formulation of the quantum mechanical problem has a relation 
to the linear algebra treatment by Löwdin (1964). 

B. Transformation of recursion formula matrixes 

By insertion of the product $TT^{-1}$ both in front of and behind the parenthesis 
in Eq. (13) and realizing the transformation properties of $\Phi$ and $c$ [Eqs. (6) 
and (7)] it follows that the transformation of $H$ and $S$ from a basis $\Phi$ to a 
basis $\Phi'$ results in 

$$H' = T^{-1}HT, \qquad S' = T^{-1}ST. \tag{16}$$

If a $T$ is known or can be determined such that $H'$ and $S'$ are symmetric where 
the original $H$ and $S$ are not, a truncated solution can surely be obtained. It is 
important to note that while for matrixes of a given size a similarity transformation 
such as Eq. (16) does not modify eigenvalues, this rule does not apply here, 
since $H'$ and $S'$ are to be truncated after the transformation, not before. 
This procedure was used successfully in the work mentioned in the introduction 
of Section IV. 
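
The remark that truncation after a similarity transformation changes the eigenvalues can be checked on a small finite stand-in for the infinite matrixes. This is a hedged sketch: the 3-by-3 matrix `H`, the upper triangular `T`, and all helper names are hypothetical choices made for illustration only. The full matrices share trace and determinant (so their eigenvalues agree), while the leading 2-by-2 blocks do not share a determinant.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def leading_block(A, n):
    """Truncate A to its leading n-by-n block."""
    return [row[:n] for row in A[:n]]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

# Hypothetical unsymmetrical matrix and an upper triangular transformation.
H = [[1.0, 2.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 2.0]]
T = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
T_INV = [[1, -1, 1],
         [0, 1, -1],
         [0, 0, 1]]  # explicit inverse of T

H_PRIME = matmul(T_INV, matmul(H, T))  # H' = T^-1 H T, as in Eq. (16)

# Invariants of the full matrices agree ...
print(trace(H), trace(H_PRIME), det3(H), det3(H_PRIME))
# ... but the truncated 2x2 blocks have different determinants,
# hence different eigenvalues.
print(det2(leading_block(H, 2)), det2(leading_block(H_PRIME, 2)))
```

In this example the truncated block of `H` has real eigenvalues while that of `H_PRIME` does not, so the order of transformation and truncation matters even for a very small matrix.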

V. Extension of Recursion Formula Methods 

Following Pekeris’ lead it would be desirable for general atomic and molecular 
problems to be able to choose a reasonably simple set of functions in 
some suitable coordinate system such that matrixes $H$ and $S$ can be calculated 
according to the definitions in Eqs. (11) and (12). The procedure would be to 
find a transformation $T$ such that $H'$ and $S'$ would be symmetrical. 

The first part of this procedure has been accomplished with exponential-power 
functions in interparticle coordinates (Frost et al., 1964b). Unfortunately 
neither the similarity transformation nor an alternate method of 
symmetrization which was tried has been sufficiently successful. Not all conceivable 
schemes for symmetrization have been exhausted so there may still 
be a prospect. 

It has occurred to several workers that the original typically unsymmetrical 
matrixes might still yield a practical solution by some limiting process as the 
size of the truncation is increased. So far, only trivial cases where the recursion 
formula has no more than three terms have been successful by this method. 

I. Introduction 

It was first pointed out by Slater (1951) that solutions of the ordinary 
Hartree-Fock equations must exist in which electrons of different spin, 
$m_s = \pm 1/2$, are described by significantly different one-electron Hamiltonian 
operators. This is obviously true for a molecule such as H$_2$ at large internuclear 
separation, and can be expected to be true for antiferromagnetic materials 
(Slater, 1951). An analysis of the Hartree-Fock equations obtained by requiring 
the energy of a single Slater determinant to be stationary, with no constraint 
other than normalization, shows that for general open-shell configurations 
the effective one-electron Hamiltonian necessarily has symmetry lower 
than that of the many-electron Hamiltonian (Nesbet, 1955). For atoms, unless 
$L = 0$, the equations differ for different values of $m_l$ and mix different values 
of $l$, and unless $S = 0$, the equations depend on $m_s$. The ground states of 
B($^2P$) and Li($^2S$) were discussed in this reference as examples, respectively, of 
reduced orbital and spin symmetry of the one-electron Hamiltonian. The 
theorems of Brillouin (1934) and of Møller and Plesset (1934), which show 
that Hartree-Fock one-electron properties of many-electron systems are 
subject to corrections only of second order and higher in the many-particle 
perturbation theory, are valid only when such properties are calculated in the 
unrestricted Hartree-Fock (UHF) approximation, with a one-electron 
Hamiltonian of reduced symmetry for an open-shell configuration. Conversely, 
if the variational equations are constrained by a symmetry restriction (not 
allowing different $l$ values to mix) or by an equivalence restriction (using 
the same one-electron Hamiltonian for all $m_l$ values or for both $m_s$ values), 
one-electron properties are subject to first-order corrections. This is the usual 
situation for traditional Hartree-Fock calculations in open-shell configurations 
(Nesbet, 1955, 1961). The $m_s$ equivalence restriction in Li($^2S$) has also 
been discussed by Pratt (1956). 

Relaxation of the $m_s$ equivalence restriction is especially important in 
calculating the Fermi contact term (Fermi, 1930) in magnetic hyperfine 
structure, 

$$a_s = \frac{8\pi}{3} \left( \frac{\mu_I \mu_B}{IJ} \right) [\rho_+(0) - \rho_-(0)]. \tag{1}$$

Here $\mu_I$ is the nuclear magnetic moment, $I$ the nuclear spin, $\mu_B$ the Bohr 
magneton, $J$ the total electronic angular momentum, $\rho_+(0)$ the $m_s = +1/2$ 
electron density at the nucleus, and $\rho_-(0)$ the $m_s = -1/2$ density at the 
nucleus. In atomic S-states ($L = 0$) this is the only contribution to the magnetic 
hyperfine structure. Since one-electron wave functions (orbitals) with 
$l$ greater than zero vanish at the nucleus, $a_s$ depends only on the atomic 
s-orbitals. Under $m_s$ equivalence restriction, inner closed-shell orbitals are 
doubly occupied, and contributions to Eq. (1) cancel out identically. If 
s-orbitals of different spin are allowed to have different radial functions there 
can be a net contribution from the inner shell orbitals, induced by an incomplete 
outer shell. When the incomplete shell contains no s-orbitals, as in 
the $1s^2 2s^2 2p^3$($^4S$) ground state configuration of nitrogen, the Fermi contact 
term vanishes in the traditional Hartree-Fock approximation, although a 
quite large magnetic hyperfine splitting is observed for N($^4S$). To compute 
this splitting it is necessary to take into account the polarization of the nominally 
closed inner shells by the unsymmetrical exchange interaction with the 
half-filled $2p^3$ shell. 

This paper will be concerned with calculations of $a_s$ for atoms in S-states, 
in particular for the $^2S$ ground states of Li and Na and for the $^4S$ ground 
states of N and P. Since $L$ vanishes, the UHF method for these states is 
equivalent to the spin-polarized Hartree-Fock (SPHF) method, in which the 
$m_s$ equivalence restriction is relaxed without reference to other symmetry or 
equivalence restrictions. The matrix form of the SPHF equations is due to 
Pople and Nesbet (1954) and to Berthier (1954), and these equations have been 
used for calculations reported here. The terminology “spin-polarized” was 
suggested by Watson and Freeman (1960), who carried out SPHF calculations 
on the $3d^8$($^3F$) ground state of Ni$^{++}$. No true UHF calculation has yet been 
published except for atomic S-states. 

While, by the usual variational criterion, the UHF equations lead to a 
better wave function than does the traditional Hartree-Fock method, the 
UHF function is in general a mixture of states of definite symmetry ($^2S$ and 
$^4S$ for the ground state of Li), and the dependence of orbital radial factors 
on $m_s$ and $m_l$ makes the calculation of corrections to the UHF approximation 
considerably more awkward than in the traditional theory. To avoid these 
difficulties, it was suggested that it might be more convenient to use perturbation 
theory to evaluate effects of the one-particle excitation matrix elements 
that occur as a consequence of symmetry and equivalence restrictions than to 
use the UHF method directly (Nesbet, 1955). More detailed arguments 
(Marshall, 1961; Bessis et al., 1961) have shown that these alternative approaches 
should lead to essentially identical results, although in the UHF 
method the trial wave function is not an angular momentum eigenfunction, 
while it is in the perturbation method. The value of $a_s$ obtained by projecting 
an angular momentum eigenfunction out of the UHF function is substantially 
different from the UHF or perturbation theory value. Without a direct 
variational calculation of the projected function, the projected value of $a_s$ 
has little theoretical justification. 

In the case of $^{31}$P($^4S$) a very serious disagreement exists between the 
experimental value of $a_s$ and theoretical values computed by the UHF or 
perturbation methods (Bessis et al., 1964). Theoretical results on atoms with 
$^2S$ or $^4S$ ground states are reviewed in Section II, below, and some new 
UHF calculations are reported which confirm the previously published work. 
In Section III, a new method is proposed for computing one-electron properties 
of atoms to high accuracy, and preliminary results on Li($^2S$) are reported. 

II. Unrestricted Hartree-Fock Calculations for Li, N, Na, and P 

Since Eq. (1) contains two experimental quantities, $a_s$ and $\mu_I$, it is convenient 
to define the experimental Fermi contact parameter 

$$f = 4\pi[\rho_+(0) - \rho_-(0)] \tag{2}$$

$$\phantom{f} = IJa_s / 31.8027\,\mu_I \tag{3}$$

in units $a_0^{-3}$ if $a_s$ is given in megacycles per second and $\mu_I$ in nuclear magnetons. 
The constant in Eq. (3) is computed from Eq. (1), using recently tabulated 
values of fundamental constants (NAS-NRC Committee, 1964). The parameter 
$f$ computed for a single determinant wave function is the sum of squared 
amplitudes of radial factors of s-orbitals with spin $m_s = +1/2$ minus the 
corresponding sum for spin $m_s = -1/2$, evaluated at $r = 0$. 

In Li($^2S$) the value of $f$ computed in the traditional Hartree-Fock approximation 
is only 2.07 $a_0^{-3}$, compared with the observed value, 2.91 $a_0^{-3}$. Calculations 
by both the perturbation (Nesbet, 1956, 1960) and UHF (Sachs, 1960) 
methods are in reasonable agreement with experiment, showing that the 
relatively large correction to the traditional Hartree-Fock result is almost 
entirely due to the exchange polarization effect. A similar result is obtained 
by Cohen et al. (1959), who treat the unbalanced exchange integral as a 
perturbation in the ordinary Hartree-Fock equations, solved by numerical 
methods. 

To verify this work, and to provide computations of comparable accuracy 
for Li($^2S$), N($^4S$), Na($^2S$), and P($^4S$), new UHF calculations have been carried 
out using orbital basis sets that give accurate traditional Hartree-Fock 
functions (Clementi, 1965). Results are tabulated in Table 1. It can be seen 
that there is a striking disagreement with experiment for both N($^4S$) and P($^4S$), 
and that the error in $f$ for Na($^2S$) is considerably larger than that for Li($^2S$). 

SPHF calculations, using Hartree’s numerical integration method, were 
carried out on a number of atoms by Goodings (1961). It was found that $f$ 
for N($^4S$) is much greater than the observed value, and is comparable to the 
computed value given in Table 1. Results for Li and Na are similar to those 
given here. 

A number of calculations on N($^4S$) were reported by Bessis et al. (1961). 
The best UHF result for $f$ (lowest variational energy) agrees with the result 
in Table 1, three times greater than the observed Fermi contact parameter. 
However, several different perturbation calculations gave results much closer 
to experiment. Configuration interaction calculations that included pair 
correlation terms as well as the exchange polarization effect appeared to give 
a value of $f$ close to experiment. 

TABLE 1 

Fermi Contact Parameters $f$ ($a_0^{-3}$) and Computed Hartree-Fock Energies 
$E$ (Hartree atomic units, $e^2/a_0$) 

Atomic state    E(HF) (a)      E(UHF) (b)     f(UHF) (b)    f(obs) (c) 
Li($^2S$)       -7.432726      -7.432745       2.923        2.90960 
N($^4S$)        -54.400904     -54.403838      3.753        1.22099 
Na($^2S$)       -161.85857     -161.85857      8.210        9.42027 
P($^4S$)        -340.71857     -340.71878     -2.240        1.14736 

(a) Clementi (1965). HF refers to the traditional Hartree-Fock method. 
(b) Computed with same orbital basis set as Clementi (1965). 
(c) Obtained by Eq. (3) from the following data: 

$\mu_I$($^7$Li) = 3.256310 nm (Ramsey, 1956) 
$a$($^7$Li; $^2S$) = 401.756 Mc/sec (Kusch and Taub, 1949) 
$\mu_I$($^{14}$N) = 0.40371 nm (Ramsey, 1956) 
$a$($^{14}$N; $^4S$) = 10.45091 Mc/sec (Anderson et al., 1959) 
$\mu_I$($^{23}$Na) = 2.21753 nm (Ramsey, 1956) 
$a$($^{23}$Na; $^2S$) = 885.80 Mc/sec (Kusch and Taub, 1949) 
$\mu_I$($^{31}$P) = 1.13162 nm (Ramsey, 1956) 
$a$($^{31}$P; $^4S$) = 55.0557 Mc/sec (Lambert and Pipkin, 1962) 
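
The observed parameters in Table 1 can be recomputed from the hyperfine data quoted in its footnote. The sketch below is an illustration, not the author's program. It assumes the constant of Eq. (3) is 31.8027 (the value that reproduces the tabulated entries), and it supplies the nuclear spins $I$ (3/2, 1, 3/2, 1/2 for $^7$Li, $^{14}$N, $^{23}$Na, $^{31}$P) and $J = S$ for the S-states as standard values that the text does not list explicitly. The names `EQ3_CONSTANT`, `fermi_contact_parameter`, and `DATA` are invented here.

```python
# Constant of Eq. (3): Mc/sec per (nuclear magneton * a0^-3). Assumed value.
EQ3_CONSTANT = 31.8027

def fermi_contact_parameter(a_s, mu_i, nuclear_spin, j_total):
    """f = I*J*a_s / (constant * mu_I), in units of a0^-3."""
    return nuclear_spin * j_total * a_s / (EQ3_CONSTANT * mu_i)

# state: (a_s in Mc/sec, mu_I in nuclear magnetons, I, J)
# a_s and mu_I are the values quoted with Table 1; I and J are assumptions
# supplied here (standard nuclear spins; J = 1/2 for 2S, 3/2 for 4S).
DATA = {
    "Li(2S)": (401.756, 3.256310, 1.5, 0.5),
    "N(4S)": (10.45091, 0.40371, 1.0, 1.5),
    "Na(2S)": (885.80, 2.21753, 1.5, 0.5),
    "P(4S)": (55.0557, 1.13162, 0.5, 1.5),
}

F_OBS = {state: fermi_contact_parameter(*values) for state, values in DATA.items()}

for state, f in F_OBS.items():
    print(f"{state:8s} f(obs) = {f:.5f} a0^-3")
```

The four results agree with the f(obs) column of Table 1 to within rounding of the last quoted digit.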

Calculations on Na($^2S$) by Cohen et al. (1959) and by Goodings (1961) 
give a result for $f$ similar to that in Table 1. The disagreement with experiment 
is greater for Na than for Li, but not so striking as it is for N and P. Calculations 
on P($^4S$) by both the perturbation and UHF methods (Bessis et al., 
1964) consistently give negative values of $f$. The experimental sign of $a_s$ for 
$^{31}$P($^4S$) has recently been redetermined (Pendlebury and Smith, 1964), 
verifying the positive sign of $f$ indicated in Table 1. 

There appears to be a large qualitative difference between the accuracy of 
the UHF results for $s$($^2S$) and $p^3$($^4S$) electronic configurations. In view of the 
serious disagreement with experiment for both N($^4S$) and P($^4S$), it is reasonable 
to look for correlation corrections to $f$ (improvements beyond the UHF 
approximation) that are sensitive to the structure of the electronic valence 
shell. A new method for analysis and computation of such corrections is 
presented in the following section. 


III. Use of Bethe-Goldstone Equations to Compute One-Electron Properties of Atoms 

In principle one can represent a stationary state many-electron wave function 
to arbitrary accuracy as a linear combination of the complete orthonormal 
set of Slater determinants 

$$\Phi_{ijk\cdots}^{abc\cdots}; \quad i < j < k < \cdots \le N < a < b < c < \cdots, \tag{4}$$

constructed from a complete orthonormal set of orbital functions that includes 
the $N$ occupied orbitals $\phi_i$ of a reference state Slater determinant 

$$\Phi_0 = \det\, \phi_1(1) \cdots \phi_i(i) \cdots \phi_N(N), \tag{5}$$

where det represents the total antisymmetrizing operator, with a normalization 
factor $(N!)^{-1/2}$. The Slater determinant indicated in Eq. (4) is obtained 
from $\Phi_0$ by replacing occupied orbitals $\phi_i, \phi_j, \ldots$, by unoccupied orbitals 
$\phi_a, \phi_b, \ldots$ from the complete set. If the Slater determinants of Eq. (4) are 
denoted in general by $\Phi_\mu$, the coefficients $c_\mu$ in the expansion of an exact wave 
function 

$$\Psi = \sum_\mu \Phi_\mu c_\mu, \tag{6}$$

are obtained by the Rayleigh-Ritz variational principle, and occur as components 
of eigenvectors of the configuration interaction matrix 

$$H_{\mu\nu} = (\Phi_\mu, H\Phi_\nu), \tag{7}$$



where $H$ is the many-electron Hamiltonian. The same formalism applies in a 
finite orbital basis, and $\Psi$ approaches an exact wave function as such a basis 
is extended to become complete. 

The Bethe-Goldstone equation (Bethe and Goldstone, 1957) is the time-independent 
Schrödinger equation for two particles of an $N$-particle system. 
The interaction with the remaining $N - 2$ particles is represented by a self-consistent 
field analogous to that of the Hartree-Fock theory, and by an 
orthogonality constraint. In the theory of Brueckner (Brueckner et al., 1954; 
Brueckner, 1954, 1955; Brueckner and Levinson, 1955; Brueckner and Wada, 
1956) the equivalent integral equations are derived from multiple scattering 
formalism. Given $N$ orthonormal orbitals for an $N$-particle system, Bethe-Goldstone 
equations are solved independently for each possible pair of 
orbitals. 

In terms of configuration interaction, with reference state $\Phi_0$ determined 
by $N$ specified occupied orbitals, the Bethe-Goldstone equation for pair $ij$ 
is just the variational equation for a trial wave function 

$$\Psi_{ij} = \Phi_0 + \sum_a (\Phi_i^a c_i^a + \Phi_j^a c_j^a) + \sum_{ab} \Phi_{ij}^{ab} c_{ij}^{ab}. \tag{8}$$

The configuration interaction matrix is diagonalized over the set of Slater 
determinants $\Phi_0$, $\Phi_i^a$, $\Phi_j^a$, $\Phi_{ij}^{ab}$ for a specified pair $ij$. The energy eigenvalue is 
expressed as $H_{00} + E_{ij}$, defining an energy increment or pair correlation 
energy for orbital pair $ij$. If $\Phi_0$ is a Hartree-Fock function (Nesbet, 1958), the 
total correlation energy is approximated by the sum of the pair correlation 
energies, $\sum_{ij} E_{ij}$. 

When expressed in this form, it is clear that the Bethe-Goldstone equation 
is a special case of a more general concept, which might be characterized as 
Bethe-Goldstone equations of order $n$, referring to the exact solution of an 
$n$-particle problem, subject to the constraint of strong orthogonality to $N - n$ 
orbitals of a specified orthonormal set. Thus, in matrix form, the third order 
Bethe-Goldstone equation for triplet $ijk$ is equivalent to diagonalization of 
the configuration interaction matrix over the set of determinants $\Phi_0$, $\Phi_i^a$, 
$\Phi_j^a$, $\Phi_k^a$, $\Phi_{ij}^{ab}$, $\Phi_{ik}^{ab}$, $\Phi_{jk}^{ab}$, $\Phi_{ijk}^{abc}$ for a specified set of three orbitals 
$\phi_i$, $\phi_j$, $\phi_k$. By definition orbitals $\phi_a$, $\phi_b$, $\phi_c$ are orthogonal to the $N$ orbitals 
$\{\phi_i\}$ occupied in $\Phi_0$. The sequence of calculations indicated by solving Bethe-Goldstone 
equations of successively higher order eventually terminates in an exact 
solution of the $N$-particle problem. The increments of energy or any other 
physical quantity obtained in successively higher orders form a series whose 
sum is the exact value for a stationary state wave function. If this series 
converges sufficiently rapidly (to 2nd or 3rd order terms) this procedure would 
provide a practicable computational method. 
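
The bookkeeping of such a hierarchy of increments can be sketched as follows. This is an assumption-laden illustration: it takes each increment to be the computed value for a set of orbitals minus all increments of its proper subsets (which reduces to $f_1 = F_1 - f_0$ at first order), and the input numbers are hypothetical, not results from the text. The function name `increments` is invented here.

```python
from itertools import combinations

def increments(F, orbitals):
    """Return increments f_S for every subset S of `orbitals`.

    F maps frozenset(S) -> value computed at the corresponding order;
    F[frozenset()] is the reference (single-determinant) value.
    Assumed definition: f_S = F_S - sum of f_T over proper subsets T of S.
    """
    f = {}
    for order in range(len(orbitals) + 1):
        for subset in combinations(orbitals, order):
            lower = sum(f[frozenset(t)]
                        for k in range(order)
                        for t in combinations(subset, k))
            f[frozenset(subset)] = F[frozenset(subset)] - lower
    return f

# Hypothetical two-orbital example.
F_VALUES = {
    frozenset(): 2.0,
    frozenset({1}): 2.7,
    frozenset({2}): 2.0,
    frozenset({1, 2}): 2.8,
}
F_INCREMENTS = increments(F_VALUES, (1, 2))

# By construction the increments sum back to the highest-order value.
print(sum(F_INCREMENTS.values()))
```

Because the increments telescope, their sum through order $n$ reproduces the value computed for the full set of $n$ orbitals, which is the sense in which the series terminates in the exact result at order $N$.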

Preliminary results for the Fermi contact parameter $f$ for Li($^2S$) obtained by 
this method are given in Table 2. The orbital basis set used is extremely 
limited, consisting only of the five s-orbitals used by Clementi (1965) to 
expand the occupied Hartree-Fock orbitals. In Table 2, the notation $F_{ij\cdots}$ is 
used for the computed mean value of the one-electron operator $F$ whose mean 
value for a single Slater determinant is the parameter $f$ of Eq. (2). $F_{ij\cdots}$ is 
computed in principle from the Bethe-Goldstone wave functions $\Psi_{ij\cdots}$ 
analogous to Eq. (8). Because all coefficients $c_\mu$ ($\mu \ne 0$) are small in the present 
calculations, squares of these coefficients have been neglected, giving the 
approximate formula, if all numbers are real, 

$$F_{ij\cdots} - F_0 = 2 {\sum_\mu}' F_{0\mu} c_\mu, \tag{9}$$

where $\mu \ne 0$ in the primed summation. Equation (9) is used to compute the 
quantities in Table 2. 

TABLE 2 

Analysis of Fermi Contact Parameter for Li($^2S$) 

Indices 1, 2, 3 denote $1s\beta$, $1s\alpha$, $2s\alpha$, respectively. 

Term                     Value ($a_0^{-3}$) 
$f_0$                    2.066865 
$f_1 = F_1 - f_0$        0.707176 
$f_2 = F_2 - f_0$        0.000013 
$f_3 = F_3 - f_0$        0.000007 
$f_{12}$                 0.090140 
$f_{13}$                 0.060908 
$f_{23}$                -0.000187 
$f_{123}$               -0.066832 
Total                    2.858090 
Observed                 2.90960 
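
The first-order formula of Eq. (9) can be recovered in a few lines. The following is a sketch of the reasoning, assuming real coefficients and matrix elements (as stated in the text), orthonormal Slater determinants, and a wave function normalized so that the coefficient of $\Phi_0$ is unity, as in Eq. (8). The symbol $F_{\mu\nu} = (\Phi_\mu, F\Phi_\nu)$ is notation introduced here for the matrix elements of the one-electron operator, with $F_{00} = F_0$.

```latex
\begin{align*}
\Psi &= \Phi_0 + {\sum_\mu}' \Phi_\mu c_\mu , \\
\frac{(\Psi, F\Psi)}{(\Psi, \Psi)}
  &= \frac{F_{00} + 2{\sum_\mu}' F_{0\mu} c_\mu
           + {\sum_{\mu\nu}}' c_\mu F_{\mu\nu} c_\nu}
          {1 + {\sum_\mu}' c_\mu^2} \\
  &= F_{00} + 2{\sum_\mu}' F_{0\mu} c_\mu + O(c^2).
\end{align*}
```

Dropping the terms quadratic in the small coefficients leaves the difference $F_{ij\cdots} - F_0$ as twice the primed sum, in agreement with Eq. (9).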

It is clear from Table 2 that the present method provides a very detailed 
analysis of properties of a many-electron system. The dominant correction to 
the traditional Hartree-Fock value, $f_0$, is the exchange polarization term, $f_1$. 
The two significant pair-correlation terms $f_{12}$ and $f_{13}$ are an order of magnitude 
smaller. 
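
As an arithmetic check on Table 2, the increments can be summed directly. In the sketch below the numerical values are those of Table 2; the assignment of the three pair values and the triple value to $f_{12}$, $f_{13}$, $f_{23}$, and $f_{123}$ is an assumption inferred from their order in the table and from the remark that $f_{12}$ and $f_{13}$ are the significant pair terms. The dictionary keys are names invented for this illustration.

```python
# Increments from Table 2, in units of a0^-3.
# The labels of the pair and triple terms are assumed (see lead-in).
TABLE_2 = {
    "f0": 2.066865,
    "f1": 0.707176,
    "f2": 0.000013,
    "f3": 0.000007,
    "f12": 0.090140,
    "f13": 0.060908,
    "f23": -0.000187,
    "f123": -0.066832,
}
F_OBSERVED = 2.90960

total = sum(TABLE_2.values())
correction = total - TABLE_2["f0"]          # everything beyond Hartree-Fock
share_f1 = TABLE_2["f1"] / correction       # weight of exchange polarization

print(f"sum of increments = {total:.6f} a0^-3")
print(f"observed          = {F_OBSERVED:.5f} a0^-3")
print(f"f1 share of total correction = {share_f1:.3f}")
```

The sum reproduces the tabulated total of 2.858090, and the single term $f_1$ accounts for most of the correction to $f_0$.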


IV. Discussion 


Although the calculation on Li($^2S$) reported in Table 2 is incomplete, using 
only a limited basis set of s-orbitals, the method used is capable of giving 
definitive results. Each of the quantities $F_{ij\cdots}$ is obtained from a wave function 
$\Psi_{ij\cdots}$ that is obtained variationally. For this reason each of the quantities 
$f_{ij\cdots}$ has a well-defined limit that can be approached by a series of variational 
calculations. In terms of matrix calculations with limited orbital basis sets, 
a separate basis set can be chosen for each of these variational calculations, if 
necessary, and a given $f_{ij\cdots}$ will be known to have converged to a specified 
accuracy if further additions to the orbital basis sets do not influence a specified 
significant digit. In this way the work of computation can be concentrated 
on the particular values that are largest or most sensitive to changes 
in the orbital basis set. 

For Li($^2S$), orbitals with $l > 0$ can contribute only to second- or third-order 
Bethe-Goldstone equations. With the basis set used here (all s-orbitals), $f_0$ 
and $f_1$ should be close to their limiting values. $f_2$ and $f_3$ can be made to vanish 
identically by tightening the convergence criterion of the Hartree-Fock 
calculation. Further changes due to electronic correlation will occur in 
$f_{12}$, $f_{13}$, $f_{23}$, and $f_{123}$. It seems very likely that $f_1$ will continue to dominate the 
final values of these quantities. 

This conclusion is in apparent disagreement with Berggren and Wood (1961), 
who argue on the basis of calculations with nonorthogonal three-electron 
functions that correlation effects are more important than exchange polarization 
in determining $f$ for Li($^2S$). This perhaps is only a question of terminology, 
since there is no simple way to analyze the wave functions of Berggren 
and Wood to give terms analogous to those in Table 2. The characterization 
of $f_1$ as an exchange polarization effect and of $f_{12}$ and $f_{13}$ as pair correlation 
effects is based on Brillouin’s theorem, discussed in Section I, above. 

A method of computation of correlation energy closely related to the method 
of Brueckner, Bethe, and Goldstone has been proposed by Sinanoğlu (1962a,b, 
1964). The method proposed here, which is based directly on $n$th order 
Bethe-Goldstone equations, has the advantage of defining a computational 
procedure with an exact solution of the Schrödinger equation as the $N$th 
order limit, and of providing an algorithm for the computation of physical 
properties other than the energy. 

Results of calculations on Li, N, Na, and P by the method of Section III 
will be reported in a subsequent publication. 

REFERENCES 

Anderson, L. W., Pipkin, F. M., and Baird, Jr., J. C. (1959). Phys. Rev. 116, 87. 

Berggren, K. F., and Wood, R. F. (1961). Phys. Rev. 130, 198. 

Berthier, G. (1954). J. Chim. Phys. 51, 363. 

Bessis, N., Lefebvre-Brion, H., and Moser, C. M. (1961). Phys. Rev. 124, 1124. 

Bessis, N., Lefebvre-Brion, H., Moser, C. M., Freeman, A. J., Nesbet, R. K., and Watson, 
R. E. (1964). Phys. Rev. 135, A588. 

Bethe, H. A., and Goldstone, J. (1957). Proc. Roy. Soc. (London) A238, 551. 

Brillouin, L. (1934). Actualités Sci. Ind. 159. Hermann et Cie., Paris. 

Brueckner, K. A. (1954). Phys. Rev. 96, 508. 

Brueckner, K. A. (1955). Phys. Rev. 97, 1353. 

Brueckner, K. A., and Levinson, C. A. (1955). Phys. Rev. 97, 1344. 

Brueckner, K. A., and Wada, W. (1956). Phys. Rev. 103, 1008. 

Brueckner, K. A., Levinson, C. A., and Mahmoud, H. M. (1954). Phys. Rev. 95, 217. 


Clementi, E. (1965). IBM J. Res. Develop. 9, 2. Suppl. Tables 4-1, 10-1, 18-1, 24-1. 

Cohen, M. H., Goodings, D. A., and Heine, V. (1959). Proc. Phys. Soc. (London) 73, 811. 

Fermi, E. (1930). Z. Physik 60, 320. 

Goodings, D. A. (1961). Phys. Rev. 123, 1706. 

Kusch, P., and Taub, H. (1949). Phys. Rev. 75, 1477. 

Lambert, R. H., and Pipkin, F. M. (1962). Phys. Rev. 128, 198. 

Marshall, W. (1961). Proc. Phys. Soc. (London) A78, 113. 

Møller, C., and Plesset, M. S. (1934). Phys. Rev. 46, 618. 

NAS-NRC Committee (1964). Physics Today 17, 48. 

Nesbet, R. K. (1955). Proc. Roy. Soc. (London) A230, 312. 

Nesbet, R. K. (1956). Quarterly Progress Report, Solid State and Molecular Theory Group, 
MIT. July 15, p. 3, Oct. 15, p. 47. Unpublished. 

Nesbet, R. K. (1958). Phys. Rev. 109, 1632. 

Nesbet, R. K. (1960). Phys. Rev. 118, 681. 

Nesbet, R. K. (1961). Rev. Mod. Phys. 33, 28. 

Pendlebury, J. M., and Smith, K. F. (1964). Proc. Phys. Soc. (London) 84, 849. 

Pople, J. A., and Nesbet, R. K. (1954). J. Chem. Phys. 22, 571. 

Pratt, Jr., G. W. (1956). Phys. Rev. 102, 1303. 

Ramsey, N. F. (1956). “Molecular Beams,” p. 172. Oxford Univ. Press, London and New 
York. 

Sachs, L. M. (1960). Phys. Rev. 117, 1504. 

Sinanoğlu, O. (1962a). J. Chem. Phys. 36, 706. 

Sinanoğlu, O. (1962b). J. Chem. Phys. 36, 3198. 

Sinanoğlu, O. (1964). Adv. Chem. Phys. 6, 315. 

Slater, J. C. (1951). Phys. Rev. 82, 538. 

Watson, R. E., and Freeman, A. J. (1960). Phys. Rev. 120, 1125. 


Application of Quantum Theory to Atomic Processes 
Occurring in Planetary Nebulae 


S. J. CZYZAK and T. K. KRUEGER 

GENERAL PHYSICS LABORATORY, AEROSPACE RESEARCH LABORATORIES 
WRIGHT-PATTERSON AIR FORCE BASE, DAYTON, OHIO 


I. Introduction 
II. Planetary Nebulae 
III. Forbidden Transitions and the Calculation of Atomic Parameters 
References 


I. Introduction 

Considerable progress has been made in recent years on the understanding 
of nebulae, especially the planetaries; this is due, in some measure, to the 
advances made in the theoretical investigation of atomic processes which occur 
in nebulae. We shall not attempt to discuss what is already known about these 
objects because this material has been thoroughly covered in the literature and 
summarized in a number of reviews (see Aller, 1956; Dufay, 1954; Osterbrock, 
1964; Seaton, 1960; Wurm, 1951; Vorontsov-Velyaminov, 1953); instead, 
we shall be primarily concerned with the advances that have been made in 
calculating atomic parameters which are used in determining some of the 
properties of nebulae, i.e., the emphasis will be on the spectroscopic and hence 
the atomic structure problems. 

Interstellar matter may be divided, somewhat superficially, into three general 
categories, namely, interstellar matter, dark nebulae, and luminous nebulae. 
The last of these can be further separated into diffuse (or irregular) and planetary 
nebulae. Interstellar matter consists of gas and dust in interstellar space, the 
gas being primarily hydrogen and helium (approximately 96% by mass), while 
the dust consists of microscopic solid grains (of the order of $10^{-5}$ cm) believed 
to be mostly dielectric compounds of hydrogen and other common elements. 
The interstellar matter is not homogeneously distributed throughout 
space; it is patchy, tending to be more dense in some areas than in others, i.e., it 
tends to form clouds or nebulae. These nebulae may or may not be visible. 
This depends on whether there are stars imbedded in the nebula to supply the 
light source, as for example in the Orion nebula, or whether a strong star field is 
in the region to silhouette the nebula against the star field, as in the case of 
the Horsehead nebula. 

The existence of interstellar gas and dust has been abundantly confirmed by 
the analysis of the spectra and luminosity of certain B stars. By using distant 
B stars near the galactic plane, analysis reveals that the B stars are redder than 
they should be and also show a few very sharp, but faint, lines that are due to 
the interstellar gas. These lines were first noted in the spectra of spectroscopic 
binaries because such lines did not take part in the periodic oscillation of the 
stellar lines. Common interstellar lines include the H and K lines of Ca$^+$ 
and lines of Ca, K, Na, and Ti$^+$. There are also interstellar molecular bands, some of which 
have been identified as CH, CN, and CH$^+$. The reddening, on the other hand, 
is caused by the interstellar dust, which scatters the star light before it reaches 
the observer. 

As implied above, nonluminous clouds of solid dustlike material and gas 
occur between the stars in many parts of the spiral arms of our galaxy and other 
galaxies. If a cloud lies in front of a dense star field it shows up as a silhouette 
which is a dark nebula. Dark nebulae reflect some light of the very distant 
stars but the reflected light intensity is so low that they are black for all practical 
purposes. Their existence is determined by the fact that they partially conceal 
luminous objects behind them. There is also evidence of the coexistence of 
dark and bright nebulae in the same region, as for example in the region of 
the star Rho Ophiuchi. 

The luminous nebulae as well as the dark nebulae are not self-luminous; it 
is the stars imbedded in the nebula that are the primary source for illumination. 
Diffuse nebulae are associated with fairly luminous stars, and it is through one 
of two processes, both driven by the emission of light from a star or stars, that 
the nebula appears as a bright object. These nebulae are associated with 
Type I population and they are normally irregular in form, often of low 
density and sometimes of considerable size, such as Orion, 30 Doradus in the 
Large Magellanic Cloud and NGC 604 in the Triangulum Spiral. The two 
processes, either of which will illuminate a nebula, are: (1) reflection of starlight 
and (2) fluorescence. Those nebulae that are illuminated by the reflection 
of starlight normally have stars of the B2 or later spectral class imbedded in 
them. These stars emit radiation whose energy maximum is insufficient to excite 
or ionize the nebular atoms, i.e., they have too little radiation beyond the 
Lyman limit to ionize hydrogen and the other atoms in the large volume of 
the surrounding nebula. Hence, the light from these stars is reflected by the 
nebula, as is indicated by the spectrograms, which show the typical absorption 
line spectrum that would be observed in stars. These nebulae are often 
referred to as reflection nebulae. Thus if stars of this type were imbedded in a 
dark nebula it would appear as a reflection nebula. Some of these nebulae are 
very large so that only the central parts are illuminated by stars. Reflection 
nebulae which are star-illuminated dark nebulae have been identified; an 
example of these is the nebula NGC 7023. For this case, the outer region appears 
as a dark nebula that almost completely masks the stars beyond it. There are 
of course no known totally opaque nebulae; the absorption range varies from 
30 to 95%. The illumination process causing fluorescence is due to the light 
from B1 or earlier spectral class stars, and if the nebula is examined spectroscopically 
a bright line spectrum is obtained. These stars are very hot 
(> 25,000°K), and starlight emitted from them is largely in the ultraviolet 
region (energy peak at approximately 900 Å). Here enough radiation 
beyond the limit of the Lyman series is available to excite or ionize hydrogen 
and the other nebular atoms. The atoms either absorb this radiation and emit 
light of longer wavelengths by variously cascading from one energy level to 
another and finally to the ground state, yielding the characteristic emission 
spectrum, or else, if the energy available is sufficiently high, the atoms may be 
ionized one or more times. The latter, a photoionization process, is one 
of the most important atomic processes that occurs in the nebula. Nebulae 
in which the fluorescence process occurs are often referred to as 
emission nebulae. 

Planetary nebulae are associated with Type II population and are often 
symmetrical in form and smaller than the diffuse nebulae. Many have a higher 
surface brightness and higher density than the better known diffuse nebulae. 
Just as in the case of diffuse nebulae illumination is due to the fluorescence 
process. Of the approximately 700 planetary nebulae known in our galaxy, 
spectra and direct photographs are available for about half of them. Detailed 
investigations have been carried out on approximately 30 of them. 


II. Planetary Nebulae 

Planetary nebulae, as indicated earlier, are clouds of ionized gas which 
receive their radiation from certain hot central stars contained within them. 
The ultraviolet radiation from the central star is absorbed by the surrounding 
gas exciting and/or ionizing the atoms within the nebula which then emit light. 
The emitted light gives the nebula a pale green disklike appearance resembling 
that of the planets Uranus and Neptune, hence the name planetary. Planetary 
nebulae are Type II population objects, although not members of the extreme 
halo branch, which are fairly small masses of gas, i.e., the masses vary over a 
wide range (probably 0.01 to 0.3 solar masses) but never exceed one solar mass, 
and are not strongly concentrated towards the galactic plane. The physical 
dimensions of these nebulae are of the order of $10^{17}$ cm or $10^{6}$ solar radii.
The radiation from the star is diluted by a factor of $(r_*/2R)^2$, where $r_*$ is the
radius of the star and R is the distance from the star to nebula. The electron
temperatures $T_e$ are of the order of $10^{4}$ °K, and the electron densities are of
the order of $10^{4}$ cm$^{-3}$. Chemically, the nebula consists mainly of hydrogen
with helium being the next most abundant element (approximately 1/6 that 
of hydrogen) and all other elements being even less abundant, as for example 
the important element oxygen which is present in a ratio of 1/1000 to that of 
hydrogen. The atomic processes occurring in planetary nebulae are quite 
similar to those in diffuse nebulae. The spectrum of a planetary contains 
emission lines and a background continuum. Thus, in a study of a planetary 
nebular spectrum one would hope to obtain information about the temperature, 
density, and chemical composition of the object, as well as the nature of the 
diluted radiation received from the central star, based on the atomic processes 
taking place in the nebula (a low-density plasma). The unique feature of 
planetary nebulae is the close relation between the central star and the nebula. 
Unlike the other types of nebulae, the nebular gas most probably has been 
ejected from the central star, with its mass being a small fraction of 
that of the star, and having an expansion velocity of the order of 20 
km/sec. 

The central stars or nuclei in planetaries cover a wide range of surface temperatures;
hence very little information can be obtained regarding their true
luminosities from their magnitudes alone. These nuclei are all hot stars with 
the temperatures ranging upwards from 25,000°K. The spectra of these nuclei 
appear to be similar to certain stars of the Type I population although they 
themselves belong to the Type II population. Their spectra may be divided 
into four main groups, namely the Wolf-Rayet type, the Of type with combined 
emission and absorption lines, the O-type with absorption lines only, and 
continuous spectra without absorption or emission lines observed. A very 
thorough and detailed exposition of stars in nebulae has been given by Aller 
(1956). 

To interpret the central stars in nebulae, it is necessary to obtain quantitative 
spectrophotometric data in order to study the profiles and intensities of the 
absorption and emission lines in the spectra of these nuclei. Excitation temperatures
and chemical composition of these nuclei can then be estimated
from the absorption or emission lines. One may also deduce the central star 
temperatures from the nebular spectrum using both allowed and forbidden 
nebular lines (see Menzel, 1962). 

If the nuclei of the planetary nebulae are assumed to radiate as black bodies,
then the density of radiation $\rho^*_\nu$ in thermodynamic equilibrium at the star
temperature $T_*$ is

$$\rho^*_\nu = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{\exp(h\nu/kT_*) - 1}. \tag{1}$$

At the surface of the star the density of radiation is $\tfrac{1}{2}\rho^*_\nu$, but as the distance
from the surface of the star is increased the radiation is diluted by a ratio of
$\tfrac{1}{4}(r_*/R)^2$ for a nebula, so that Eq. (1) becomes

$$\rho_\nu = \frac{1}{4}\left(\frac{r_*}{R}\right)^2 \frac{8\pi h\nu^3}{c^3}\,\frac{1}{\exp(h\nu/kT_*) - 1}. \tag{2}$$

Thus, on the basis of this equation there are three parameters to determine, 
namely, the star radius r*, star temperature T *, and the distance R from the 
center. It is still necessary to take into account the direction in which the 
radiation process takes place. Rosseland’s theorem states that, for a dilution 
of radiation, quanta of high frequency are more likely to transform into quanta 
of low frequency than the converse. In a nebula the converse may be completely 
neglected and only transformation from high to low frequency need be considered.
Since central stars are very hot, part of the energy in the ultraviolet
region which is transformed into visible radiation by the nebula exceeds the 
energy that is emitted by the central star in the visible region, so that the nebulae 
have greater visual luminosities than their nuclei. 
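
The dilution of the stellar radiation field described above can be illustrated with a short numerical sketch. The Python fragment below evaluates the black-body density of Eq. (1) and scales it by the geometric factor $\tfrac{1}{4}(r_*/R)^2$ quoted in the text. The function names, the sample stellar radius (half a solar radius), and the sample temperature are illustrative assumptions and are not values taken from this chapter.

```python
import math

H_PLANCK = 6.62607015e-27   # Planck constant, erg s
K_BOLTZ = 1.380649e-16      # Boltzmann constant, erg/K
C_LIGHT = 2.99792458e10     # speed of light, cm/s


def planck_density(nu, t_star):
    """Black-body radiation density of Eq. (1), erg cm^-3 Hz^-1."""
    x = H_PLANCK * nu / (K_BOLTZ * t_star)
    return (8.0 * math.pi * H_PLANCK * nu**3 / C_LIGHT**3) / math.expm1(x)


def dilution_factor(r_star, distance):
    """Geometric dilution (1/4)(r*/R)^2, meaningful for R much larger than r*."""
    return 0.25 * (r_star / distance) ** 2


def diluted_density(nu, t_star, r_star, distance):
    """Diluted radiation density: Eq. (1) multiplied by the dilution factor."""
    return dilution_factor(r_star, distance) * planck_density(nu, t_star)


# Assumed sample values (hypothetical star, nebular scale from the text).
R_SUN = 6.96e10                      # cm
NU_LYMAN = C_LIGHT / 911.75e-8       # frequency of the Lyman limit, Hz
w = dilution_factor(0.5 * R_SUN, 1.0e17)
rho = diluted_density(NU_LYMAN, 5.0e4, 0.5 * R_SUN, 1.0e17)
print(f"dilution factor = {w:.2e}, diluted density at Lyman limit = {rho:.2e}")
```

With a nebular scale of $10^{17}$ cm the dilution factor is many orders of magnitude below unity, which is why the nebular gas is far from thermodynamic equilibrium with the stellar radiation.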

It has been assumed in Eq. (2) that the nebula does not contribute anything 
to the radiation, whereas it is the nebula that supplies the diffuse radiation 
from the hydrogen, helium, and other elements within it, by photoionization 
and recombination of H I, He I, and He II and by the forbidden transitions 
of the heavier elements. As a matter of fact, the temperatures of planetary 
nuclei were first estimated on theories of excitation of the nebula. If a nebular 
shell is optically thick* then the radiation beyond the Lyman limit emitted by
the nucleus would be absorbed. Zanstra (1927, 1931a,b) showed that for H I
each Lyman continuum quantum would be transformed into a Lyman α (Ly$\alpha$)
quantum and a Balmer series quantum or continuum. Thus the total
number of quanta emitted by the central star beyond the Lyman limit (Ly$_c$)
must equal the total number of quanta in the Balmer series plus the continuum,
i.e., $N_{Ba+Bac} = N_{Ly_c}$. In general it can be stated that the number of Balmer
quanta emitted by the nebula does not exceed the number of Ly$_c$ quanta
emitted by the central star, or $N_{Ba} \le N_{Ly_c}$. The $N_{Ba+Bac}$ then gives a lower
limit to the number $N_{Ly_c}$. By comparing the intensities of nebular Balmer
lines with that of the underlying stellar continuum the temperature of the
central star can be estimated, assuming that it radiates as a black body.
Analogous reasoning may be employed for cases where He II lines are observed
except, of course, here the Paschen and Brackett series are observed
rather than the Lyman and Balmer ones. Thus the temperatures may be
obtained from both H I and He II.

* Optical thickness is defined as $\tau = \int k_\nu \rho\, ds$, where $k_\nu$ is the absorption coefficient, $\rho$ is
the density of the absorbing material, and s is the thickness of the absorbing layer.

The total number of quanta emitted by the star beyond the Lyman limit is

$$N_{Ly_c} = \frac{8\pi^2 r_*^2}{c^2} \int_{\nu_0}^{\infty} \frac{\nu^2\, d\nu}{\exp(h\nu/kT_*) - 1}, \tag{3}$$

where $\nu_0$ is the frequency of the Lyman series limit. From observations one
can determine the quantity

$$A_i = \frac{E_i}{\nu_i\,(\partial E/\partial\nu)_i}, \tag{4}$$

where $E_i$ is the total amount of energy emitted per unit time by the entire
nebula in the ith Balmer line, and $(\partial E/\partial\nu)_i$ is the total amount of energy
emitted per unit time and frequency by the underlying stellar continuum at
the same frequency. From these quantities the dimensionless quantity $A_i$ is
obtained.

Since

$$\left(\frac{\partial E}{\partial\nu}\right)_i = \frac{8\pi^2 r_*^2 h \nu_i^3}{c^2}\,\frac{1}{\exp(h\nu_i/kT_*) - 1}, \tag{5}$$

the total number of Balmer quanta emitted by the nebula is:

$$N_{Ba+Bac} = \frac{8\pi^2 r_*^2}{c^2} \sum_i \frac{\nu_i^3 A_i}{\exp(h\nu_i/kT_*) - 1}. \tag{6}$$

From the expression $N_{Ba+Bac} \le N_{Ly_c}$ we get:

$$\sum_i \frac{\nu_i^3 A_i}{\exp(h\nu_i/kT_*) - 1} \le \int_{\nu_0}^{\infty} \frac{\nu^2\, d\nu}{\exp(h\nu/kT_*) - 1} \tag{7}$$

or

$$\sum_i \frac{x_i^3 A_i}{e^{x_i} - 1} \le \int_{x_0}^{\infty} \frac{x^2\, dx}{e^{x} - 1}, \tag{8}$$

where

$$x = h\nu/kT_*; \qquad x_0 = h\nu_0/kT_*; \qquad x_i = h\nu_i/kT_*$$

and the summation on the left side is over the Balmer lines and continuum.
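
As a rough illustration of how Eq. (8) is used in practice, the sketch below finds the temperature at which the two sides become equal. Under Eq. (8), temperatures at or above that root satisfy the inequality. The integral on the right is evaluated with the series $\sum_{n\ge1} e^{-n x_0}(x_0^2/n + 2x_0/n^2 + 2/n^3)$, and the root is located by bisection in $\log T_*$. The function names, the search bracket, and the synthetic single-line input are assumptions made for the example and are not observational data from the text.

```python
import math

H_OVER_K = 4.799243e-11          # h/k in K s
C_LIGHT = 2.99792458e10          # cm/s
NU_LYMAN = C_LIGHT / 911.75e-8   # Lyman-limit frequency, Hz


def lyman_integral(x0, terms=400):
    """Integral of x^2/(e^x - 1) from x0 to infinity (right side of Eq. 8)."""
    return sum(
        math.exp(-n * x0) * (x0 * x0 / n + 2.0 * x0 / n**2 + 2.0 / n**3)
        for n in range(1, terms + 1)
    )


def balmer_sum(lines, t_star):
    """Left side of Eq. (8); `lines` is a list of (nu_i, A_i) pairs."""
    total = 0.0
    for nu_i, a_i in lines:
        x_i = H_OVER_K * nu_i / t_star
        total += a_i * x_i**3 / math.expm1(x_i)
    return total


def zanstra_temperature(lines, t_lo=1.0e4, t_hi=1.0e6, iterations=200):
    """Temperature at which both sides of Eq. (8) are equal (bisection)."""
    def excess(t_star):
        return balmer_sum(lines, t_star) - lyman_integral(H_OVER_K * NU_LYMAN / t_star)

    if excess(t_lo) <= 0.0 or excess(t_hi) >= 0.0:
        raise ValueError("root is not bracketed by [t_lo, t_hi]")
    for _ in range(iterations):
        t_mid = math.sqrt(t_lo * t_hi)
        if excess(t_mid) > 0.0:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return math.sqrt(t_lo * t_hi)


# Synthetic check: build an A_i for a single line near 4861 A that is exactly
# consistent with an assumed 40,000 K star, then recover that temperature.
T_ASSUMED = 4.0e4
NU_LINE = C_LIGHT / 4861.0e-8
x_line = H_OVER_K * NU_LINE / T_ASSUMED
a_line = lyman_integral(H_OVER_K * NU_LYMAN / T_ASSUMED) / (x_line**3 / math.expm1(x_line))
print(zanstra_temperature([(NU_LINE, a_line)]))
```

The same routine could be adapted to Eq. (14) by replacing $x_i^3$ with $x_i^4$ on the left and the integrand with $(x - x_0)x^2/(e^x - 1)$ on the right.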

If the temperature is to be determined from the forbidden lines, as for
example the two nebulium lines ($N_1$ and $N_2$), the expression is analogous to
Eq. (8) and was obtained by Zanstra in essentially the following way:

Suppose that free electrons are obtained by the ionization of hydrogen
atoms. If this is the case then the electron will acquire a kinetic energy of

$$\tfrac{1}{2}mv^2 = h\nu - h\nu_0 \tag{9}$$

by the absorption of a quantum of frequency $\nu$ where the H ionization
frequency is $\nu_0$. In a frequency interval $\nu$ to $\nu + d\nu$ the total quanta emitted
per unit time will be

$$\frac{8\pi^2 r_*^2}{c^2}\,\frac{\nu^2\, d\nu}{\exp(h\nu/kT_*) - 1}. \tag{10}$$

On the other hand the total amount of kinetic energy obtained by the electrons
per unit time by the absorption of all the Ly$_c$ quanta from the star becomes

$$\frac{8\pi^2 r_*^2 h}{c^2} \int_{\nu_0}^{\infty} \frac{(\nu - \nu_0)\,\nu^2}{\exp(h\nu/kT_*) - 1}\, d\nu. \tag{11}$$

From the forbidden O III lines ($N_1$ and $N_2$) in the nebula the energy emitted
is

$$\frac{8\pi^2 r_*^2 h}{c^2} \sum_i \frac{\nu_i^4 A_i}{\exp(h\nu_i/kT_*) - 1}, \tag{12}$$

where the summation is over the nebulium lines and $A_i$ is determined from
observation [see Eq. (4)]. By Zanstra's theorem the energy required to excite
the forbidden O III lines does not exceed the kinetic energy obtained by the
electrons; therefore

$$\sum_i \frac{\nu_i^4 A_i}{\exp(h\nu_i/kT_*) - 1} \le \int_{\nu_0}^{\infty} \frac{(\nu - \nu_0)\,\nu^2}{\exp(h\nu/kT_*) - 1}\, d\nu \tag{13}$$

or

$$\sum_i \frac{x_i^4 A_i}{e^{x_i} - 1} \le \int_{x_0}^{\infty} \frac{(x - x_0)\,x^2}{e^{x} - 1}\, dx. \tag{14}$$


Thus by means of Eqs. (2), (8), or (14) the temperature of the central star may 
be determined. In addition to these there are several additional techniques for 
determining central star temperatures, namely, the method of Stoy (1933), 
Wurm’s (1951) Balmer continuum method, and the spectral class method. In 
Stoy’s method a comparison is made of the intensities of the Balmer nebular 
lines and forbidden lines. The basic assumption is the same as that made by 
Zanstra [which yields Eq. (14)] in his “nebulium” method, i.e., the energy 
brought into the continuum by the electrons photoelectrically detached from 
hydrogen is all dissipated in inelastic collisions with ions of O III, N II, S II, 
etc. Wurm, on the other hand, derived the temperature of the central star from 
the ratio of the intensity in the nebular continuum at the Balmer limit to that 
of the continuum of the central star. The ratio depends on the central star 
temperature, the fraction of the captures on the second level, and on the optical 
thickness of the nebula. Petrie (1947) applied the spectral class method to
Population Type I stars and Wilson and Aller (Aller, 1948; Wilson, 1948;
Wilson and Aller, 1954) to nebulae. For this method equivalent widths $W_\lambda$ of
hydrogen and helium lines are used to determine the spectral classes, i.e., by
measuring the $W_\lambda$'s one can read off the spectral classes. For most central stars
in planetary nebulae the He I/He II or the He II/H ratios may be employed
provided the blending effects with nebular lines are taken into account. In this
method, in which Petrie calibrated the excitation temperatures of normal O- 
type stars, it is assumed that the helium/hydrogen ratio is the same in all stars. 
Of the various methods which are available for determining the temperature of 
the central star the methods of Zanstra, Stoy, and Petrie are used most often. 

The spectral lines arising in the nebula are due to the following three 
mechanisms: 

(a) The primary physical process that occurs in the nebula itself is the
photoionization mechanism due to the absorption of stellar ultraviolet radiation
by the surrounding gas cloud, i.e.,

$$\mathrm{H} + h\nu \rightarrow \mathrm{H}^+ + e^-, \tag{15}$$


and occurs in atoms in the ground state. The electrons are then recaptured 
leaving the atoms in highly excited levels from which they cascade to lower 
levels emitting lines of allowed transitions. While this recombination occurs 
primarily in hydrogen and helium, it also has been observed in the heavier 
elements such as oxygen, carbon, and nitrogen. 

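The emission per unit volume for an $n \to n'$ transition in hydrogen or ionized
helium may be written in the following form:

$$E(n \to n') = 2KhR\,\frac{Z^6\, g(n \to n')\, N_i N_e\, b_n(T_e)}{n'^3 n^3\, T_e^{3/2}}\,\exp\!\left(\frac{hRZ^2}{n^2 kT_e}\right)$$

or

$$E(n \to n') = 1.42 \times 10^{-16}\,\frac{Z^6\, g(n \to n')\, N_i N_e\, b_n(T_e)}{n'^3 n^3\, T_e^{3/2}}\,\exp\!\left(\frac{157{,}000\, Z^2}{n^2 T_e}\right). \tag{16}$$
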


Here n and n' denote the upper and lower levels, respectively, $N_i$ the number
of ions of hydrogen or doubly ionized helium per cm$^3$, $N_e$ the electronic concentration,
Z the nuclear charge, $T_e$ the electron temperature, $g(n \to n')$ the
Gaunt correction factor, which for most applications may be taken as unity,
and $b_n(T_e)$ is a measure of the deviation from thermodynamic equilibrium
(see Seaton, 1960; Baker and Menzel, 1938; Burgess, 1958). The $T_e$ and $N_e$ of
the gas usually can be determined from the relative forbidden line intensities.
By measuring the surface brightness of a nebula at a known distance an estimate
of the energy, electron, or ion density can be obtained. Usually it is possible
to equate the $N_i$ of hydrogen with the $N_e$ without too much loss in accuracy,
since the hydrogen is very abundant in relation to helium and the other ions.
A similar theory of recombination has been developed for helium and other
ions by Mathis (1957) and Seaton (1964) and by Burgess and Seaton (1960),
respectively.

(b) Collisional excitations are responsible for the strongest lines in most
planetary nebulae. These lines are due to the well-known forbidden transitions,
first explained by Bowen (1928), which arise from the collisional excitation of
the metastable levels that lie a few electron volts above the ground level.
After the electrons have been excited to the metastable levels by inelastic
collisions they return to a lower level either with the emission of a forbidden
quantum of the magnetic dipole or electric quadrupole type (or both), or by a
collision of the second kind.


[Figure 1: energy-level diagrams for the $p^2$, $p^3$, and $p^4$ configurations]

Fig. 1. The forbidden transitions for $p^q$ configurations.

In Fig. 1 the forbidden transition scheme is shown for the $p^q$ type configurations
(q = 2, 3, and 4) which are some of the important ones in astrophysics.

The number of collisional excitations/cm$^3$/sec from a lower level n' to an
upper level n depends on the ion density in level n', the electron density, the
electron temperature, the excitation potential, $\chi_{n'n}$, of the upper level, the
statistical weight, $\omega_{n'}$, of the lower level, and the collision strength, $\Omega(n', n)$,
of the particular ion and the transition involved, viz.,

$$\mathcal{F}_{n'n} = 8.63 \times 10^{-6}\, N_{n'} N_e\, T_e^{-1/2}\, \frac{\Omega(n', n)}{\omega_{n'}}\, e^{-\chi_{n'n}/kT_e}. \tag{17}$$

The corresponding number of collisions of the second kind per cm$^3$ per sec is:

$$\mathcal{F}_{nn'} = 8.63 \times 10^{-6}\, N_n N_e\, T_e^{-1/2}\, \frac{\Omega(n, n')}{\omega_n}, \tag{18}$$

and under conditions of thermodynamic equilibrium $\mathcal{F}_{n'n} = \mathcal{F}_{nn'}$. However,
for a steady-state condition the number of collisional excitations $\mathcal{F}_{n'n}$ equals
the number of collisional de-excitations $\mathcal{F}_{nn'}$ plus the number of radiative
de-excitations $N_n A_{nn'}$, i.e., $\mathcal{F}_{n'n} = \mathcal{F}_{nn'} + N_n A_{nn'}$, where $A_{nn'}$ is the Einstein
probability coefficient for the radiative transition. Thus the emission per unit
volume for the forbidden line radiation is

$$E_{nn'} = N_n A_{nn'}\, h\nu_{nn'}. \tag{19}$$

If Eqs. (17) and (18) and the steady-state condition are introduced into Eq. (19),
we then obtain a complex expression which contains two very important
atomic parameters, i.e., $\Omega(n, n')$ and $A_{nn'}$, whose accuracy can seriously affect
the $T_e$ and $N_e$ and hence the chemical abundances in a nebula.

(c) A fluorescence mechanism due to Bowen (1935) shows that a number of
strong lines should appear in the near ultraviolet, and these have been observed
in nitrogen and oxygen in high-excitation planetaries. This effect is due to the
absorption of the He II resonance (Ly α) line by O III, raising it to the $2p3d\;{}^3P_2$
level which then either returns to the ground level $2p^2\;{}^3P_2$ or cascades through
the 2p3p and 2p3s levels emitting radiation in the process. A similar cycle
occurs for N III wherein lines are produced by cascades from $3d\;{}^2D$ to $3p\;{}^2P$
and from $3p\;{}^2P$ to $3s\;{}^2S$, respectively. Thus the He II resonance Ly α radiation
provides the source of energy to O III and N III ions, and these lines appear
only in nebulae with a strong He II λ4686 line. The Bowen fluorescence is
identified by the conspicuous intensities of the lines in the ultraviolet region 
of O III which arise by cascade from a single upper level. 
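
The steady-state balance described under mechanism (b) can be made concrete for a single pair of levels. The sketch below is a minimal two-level illustration, assuming that the rate coefficients carry the numerical constant $8.63 \times 10^{-6}$ quoted in the text and that the upper-level population follows from excitations = de-excitations + radiative decays. The function names and the sample atomic values in the usage lines are hypothetical placeholders, not data for a specific ion.

```python
import math

RATE_CONST = 8.63e-6       # numerical constant of the collisional rates
K_BOLTZ_EV = 8.617333e-5   # Boltzmann constant, eV/K
EV_TO_ERG = 1.602177e-12   # erg per eV


def excitation_coeff(omega, w_lower, chi_ev, t_e):
    """Collisional excitation rate coefficient, cm^3 s^-1."""
    boltz = math.exp(-chi_ev / (K_BOLTZ_EV * t_e))
    return RATE_CONST * omega / (w_lower * math.sqrt(t_e)) * boltz


def deexcitation_coeff(omega, w_upper, t_e):
    """Rate coefficient for collisions of the second kind, cm^3 s^-1."""
    return RATE_CONST * omega / (w_upper * math.sqrt(t_e))


def upper_to_lower_ratio(n_e, t_e, omega, w_lower, w_upper, chi_ev, a_ul):
    """N_upper/N_lower from: excitations = de-excitations + radiative decays."""
    q_up = excitation_coeff(omega, w_lower, chi_ev, t_e)
    q_down = deexcitation_coeff(omega, w_upper, t_e)
    return n_e * q_up / (n_e * q_down + a_ul)


def line_emission(n_lower, n_e, t_e, omega, w_lower, w_upper, chi_ev, a_ul):
    """Emission per unit volume N_upper * A * h*nu, erg cm^-3 s^-1."""
    ratio = upper_to_lower_ratio(n_e, t_e, omega, w_lower, w_upper, chi_ev, a_ul)
    return n_lower * ratio * a_ul * chi_ev * EV_TO_ERG


# Hypothetical sample values chosen only to exercise the functions.
ARGS = dict(t_e=1.0e4, omega=2.0, w_lower=9.0, w_upper=5.0, chi_ev=2.5, a_ul=0.03)
for n_e in (1.0e2, 1.0e4, 1.0e8):
    print(n_e, upper_to_lower_ratio(n_e, **ARGS))
```

At low electron density the ratio grows linearly with $N_e$ (every excitation is followed by a radiative decay), while at high density it saturates at the Boltzmann value $(\omega_n/\omega_{n'})\exp(-\chi/kT_e)$; the sensitivity of the result to the collision strength and to the transition probability is different in the two regimes.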

In addition to the spectral lines observed in planetary nebulae there also 
exists an underlying continuum apparently due to three processes, namely, 
(1) recombination of ions and electrons to excited levels, (2) free-free transitions
in the fields of ions and electrons, and (3) two-quantum emissions from
the 2s level of hydrogen. 

One of the most difficult problems that still exists in the estimation and 
interpretation of forbidden line intensities [see (b) above] is the calculation of 
the collisional cross-section parameters and to some degree the transition 
probabilities. For the calculation of both parameters accurate atomic wave 
functions are required. 


III. Forbidden Transitions and the Calculation of Atomic Parameters 

Electron temperature $T_e$ estimates were originally obtained from the study
of the relative intensities of the hydrogen recombination spectrum (see Page,
1935, 1942). However, by this method only order of magnitude estimates can
be obtained because the relative intensities are insensitive to $T_e$. Likewise, the
early predictions of the electron density $N_e$ (see Menzel and Aller, 1941; Page
and Greenstein, 1951) were obtained by estimating the absolute emission rate
in the hydrogen recombination spectrum. But, in order to use this procedure
it was necessary to determine the absolute surface brightness and the absolute
nebular dimensions. Unfortunately, both properties are difficult to determine
to the high precision required, i.e., to determine the dimensions and $N_e$ from
the surface brightness in H$\beta$ one must assume that the nebula is uniform. This
presumes that there are no large density fluctuations, an assumption which is often
not true. In principle it should be possible to obtain more reliable $N_e$ and $T_e$
values from the relative intensities of the forbidden lines of the various ions.
For a particular ion the relative intensities are determined from the excitation,
deactivation, and spontaneous emission rates. Thus, if the excitation and
deactivation cross sections Q or collision strengths $\Omega$ and the transition
probabilities A are known, then the $T_e$ and $N_e$ may be determined as can be
seen from Eqs. (17), (18), and (19). Figure 1 shows that the ions giving the
forbidden transitions contain two metastable levels.

For ions with sufficiently large transition probabilities from the metastable
levels, it is possible for the electron deactivation of the levels to be negligible;
this occurs when the electron density is low. The ratio of the emission rates
then will be proportional to the ratio of the excitation rates for the two levels,
and the ratio of the excitation rates will be a function of the $T_e$ only. Thus,
the intensity ratios may be used to determine the $T_e$. Where the $N_e$ is high
enough for the deactivation to be important, the intensity ratio becomes a
function of the $T_e$ and $N_e$, and consequently observations on the relative
intensities of at least two ions are necessary. However, it is desirable to have
data on several ions all of which should yield the same value for $T_e$ and $N_e$,
provided it is assumed that the distribution of the ions throughout the nebula
is homogeneous. Whether or not identical results are obtained will depend
primarily on the accuracy of the atomic parameters. When the $N_e$ is high, the
electron excitation and deactivation of the metastable levels lead to a Boltzmann
distribution among the levels of the ion. The intensity ratio for the
forbidden lines is then a function of the radiative transition probabilities and
the $T_e$, but not the $N_e$.

The explicit dependence of the intensity ratio on the $T_e$ and $N_e$ can be
obtained by mathematically formulating the steady-state condition referred
to previously. For an ion with two metastable levels, say $^1S$ and $^1D$, with the
$^3P$ being the ground term and referred to hereafter as states "3", "2", and
"1", respectively, we have, following the procedure of Aller (1956), for state 3

$$N_1 N_e q_{13} + N_2 N_e q_{23} = N_3 [N_e q_{32} + N_e q_{31} + A_{32} + A_{31}], \tag{20}$$

and for state 2

$$N_1 N_e q_{12} + N_3 N_e q_{32} + N_3 A_{32} = N_2 N_e (q_{23} + q_{21}) + N_2 A_{21},$$

where

$$q_{mn} = q_{nm}\,\frac{\omega_n}{\omega_m}\,\exp\!\left(-\frac{\chi_{mn}}{kT_e}\right) \tag{21}$$

and

$$q_{nm} = \frac{8.63 \times 10^{-6}}{\omega_n \sqrt{T_e}}\,\Omega_{nm}$$

are the activation and deactivation coefficients, respectively. The quantities
$N_1$, $N_2$, and $N_3$ represent the number of atoms in each of the respective levels,
the A's the transition probabilities, the $\omega$'s the statistical weights of the levels,
the $\chi_{mn}$'s the difference in the excitation potential between states m and n,
E the kinetic energy of the free electron, and $\Omega_{nm}$ the collision strength.

The deviation from thermodynamic equilibrium of states 1, 2, and 3 may
be expressed as follows:

$$\frac{N_2}{N_1} = \frac{b_2\,\omega_2}{b_1\,\omega_1}\,\exp\!\left(-\frac{\chi_{12}}{kT_e}\right) \tag{22}$$

and

$$\frac{N_3}{N_2} = \frac{b_3\,\omega_3}{b_2\,\omega_2}\,\exp\!\left(-\frac{\chi_{23}}{kT_e}\right). \tag{23}$$


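The ratios $b_3/b_2$ and $b_2/b_1$ represent the deviation from thermodynamic
equilibrium, which have been rigorously developed by Menzel et al. (1941)
and are

$$\frac{b_3}{b_2} = \frac{1 + \dfrac{\Omega_{23}\exp(-\chi_{23}/kT_e)}{\Omega_{12}(1 + \Omega_{23}/\Omega_{13})} + \dfrac{A_{21}\,\omega_2}{C\,\Omega_{12}(1 + \Omega_{23}/\Omega_{13})}}{1 + \dfrac{(A_{32} + A_{31})\,\omega_3}{C\,\Omega_{13}(1 + \Omega_{23}/\Omega_{13})} + \left(\Omega_{23} + \dfrac{A_{32}\,\omega_3}{C}\right)\dfrac{\exp(-\chi_{23}/kT_e)}{\Omega_{12}(1 + \Omega_{23}/\Omega_{13})}} \tag{24}$$

and

$$\frac{b_2}{b_1} = \frac{\Omega_{12} + \Omega_{13}\, d\, \exp(-\chi_{23}/kT_e)}{(\Omega_{12} + A_{21}\,\omega_2/C) + \Omega_{23}(1 - d)\exp(-\chi_{23}/kT_e)}, \tag{25}$$

where

$$C = 8.63 \times 10^{-6}\, N_e / T_e^{1/2}$$

and

$$d = \frac{\Omega_{32} + (A_{32}/C)\,\omega_3}{\Omega_{13} + \Omega_{23} + (A_{32} + A_{31})\,\omega_3/C}.$$
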
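In order to get some idea as to which of the atomic parameters, i.e., $\Omega_{mn}$ or
$A_{mn}$, predominates, let us consider a high density and a low density nebula.
If $N_e$ is allowed to increase, then C increases and in the limit $b_3/b_2 \to 1$
so that

$$\frac{N_3}{N_2} = \frac{\omega_3}{\omega_2}\,\exp\!\left(-\frac{\chi_{23}}{kT_e}\right);$$

the ratio of the intensities becomes

$$\frac{I_{32}}{I_{21}} = \frac{\omega_3\,\nu_{32}\,A_{32}}{\omega_2\,\nu_{21}\,A_{21}}\,\exp\!\left(-\frac{\chi_{23}}{kT_e}\right), \tag{26}$$

where the $\nu$'s are transition values in wave numbers. Similarly if $N_e$ decreases
then C decreases and in the limit

$$\frac{I_{32}}{I_{21}} = \frac{\nu_{32}}{\nu_{21}}\,\exp\!\left(-\frac{\chi_{23}}{kT_e}\right)\left[\frac{\Omega_{12}}{\Omega_{13}}\left(1 + \frac{A_{31}}{A_{32}}\right) + \exp\!\left(-\frac{\chi_{23}}{kT_e}\right)\right]^{-1}. \tag{27}$$
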


It can be seen that in Eq. (26) transition probabilities play an important part
in the determination of $T_e$, whereas in Eq. (27) it is collision strengths that
predominate, since the contribution from $A_{31}/A_{32}$ is small. A very thorough
exposition of such calculations has been given by Aller (1954, 1956), Aller
et al. (1949), Shortley et al. (1941), Osterbrock (1955), Seaton (1958, 1959, 1960,
1962), Seaton and Osterbrock (1957), Garstang (1951, 1952, 1956, 1962),
Czyzak and Krueger (1963), and Krueger and Czyzak (1965). However, what is
more important is that in order to determine the $T_e$ and $N_e$ accurately, precise
values of the collision strengths $\Omega$ and the transition probabilities A are necessary.
For the calculation of both of these parameters it is necessary to use
atomic wave functions, and the accuracy of these wave functions indirectly
plays a significant role in the determination of not only the $T_e$ and $N_e$ but also
the estimation of the chemical abundances.
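
The three-level balance can also be checked numerically without any closed-form algebra. The sketch below solves the two balance equations (for states 3 and 2) as a 2 × 2 linear system in $N_2/N_1$ and $N_3/N_1$, converts the result to the departure ratios $b_2/b_1$ and $b_3/b_2$, and compares $b_2/b_1$ with the expression obtained by eliminating $b_3/b_2$ algebraically. The class and function names are hypothetical; the sample ion uses collision strengths and transition probabilities of the size quoted for O III in this chapter, while $\Omega_{23}$ and the excitation energies (expressed as $\chi/k$ in kelvin) are assumed round values chosen only for illustration.

```python
import math
from dataclasses import dataclass

RATE_CONST = 8.63e-6  # numerical constant of the collisional rate coefficients


@dataclass
class ThreeLevelIon:
    omega12: float  # collision strengths (taken symmetric in their indices)
    omega13: float
    omega23: float
    w1: float       # statistical weights of states 1, 2, 3
    w2: float
    w3: float
    chi12: float    # excitation energy differences divided by k, in K
    chi23: float
    a21: float      # transition probabilities, s^-1
    a32: float
    a31: float


def populations(ion, n_e, t_e):
    """Return (N2/N1, N3/N1) from the balance equations for states 3 and 2."""
    c = RATE_CONST / math.sqrt(t_e)
    q12 = c * ion.omega12 / ion.w1 * math.exp(-ion.chi12 / t_e)
    q13 = c * ion.omega13 / ion.w1 * math.exp(-(ion.chi12 + ion.chi23) / t_e)
    q23 = c * ion.omega23 / ion.w2 * math.exp(-ion.chi23 / t_e)
    q21 = c * ion.omega12 / ion.w2
    q31 = c * ion.omega13 / ion.w3
    q32 = c * ion.omega23 / ion.w3
    # Unknowns x = N2/N1 and y = N3/N1.
    # state 3: -n_e q23 x + (n_e (q32 + q31) + A32 + A31) y = n_e q13
    # state 2: (n_e (q23 + q21) + A21) x - (n_e q32 + A32) y = n_e q12
    m11, m12, r1 = -n_e * q23, n_e * (q32 + q31) + ion.a32 + ion.a31, n_e * q13
    m21, m22, r2 = n_e * (q23 + q21) + ion.a21, -(n_e * q32 + ion.a32), n_e * q12
    det = m11 * m22 - m12 * m21
    return (r1 * m22 - m12 * r2) / det, (m11 * r2 - m21 * r1) / det


def departure_ratios(ion, n_e, t_e):
    """Return (b2/b1, b3/b2) by dividing out the Boltzmann factors."""
    x, y = populations(ion, n_e, t_e)
    b2_b1 = x * (ion.w1 / ion.w2) * math.exp(ion.chi12 / t_e)
    b3_b2 = (y / x) * (ion.w2 / ion.w3) * math.exp(ion.chi23 / t_e)
    return b2_b1, b3_b2


def b2_over_b1_closed_form(ion, n_e, t_e):
    """b2/b1 after algebraic elimination, with C = 8.63e-6 N_e / sqrt(T_e)."""
    c = RATE_CONST * n_e / math.sqrt(t_e)
    boltz = math.exp(-ion.chi23 / t_e)
    d = (ion.omega23 + ion.a32 * ion.w3 / c) / (
        ion.omega13 + ion.omega23 + (ion.a32 + ion.a31) * ion.w3 / c)
    return (ion.omega12 + ion.omega13 * d * boltz) / (
        ion.omega12 + ion.a21 * ion.w2 / c + ion.omega23 * (1.0 - d) * boltz)


# Illustrative ion: omega23 and the chi values are assumptions for the example.
SAMPLE_ION = ThreeLevelIon(omega12=2.24, omega13=0.29, omega23=0.6,
                           w1=9.0, w2=5.0, w3=1.0,
                           chi12=29170.0, chi23=32977.0,
                           a21=0.028, a32=1.600, a31=0.237)
for n_e in (1.0e3, 1.0e6, 1.0e12):
    print(n_e, departure_ratios(SAMPLE_ION, n_e, 1.0e4))
```

As the electron density is raised both ratios approach unity, which is the Boltzmann limit described in the text; at nebular densities they are far from unity and depend on both the collision strengths and the transition probabilities.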

The dependence of the T e and the N e on the transition probabilities and the 
collision strengths may be illustrated in the following way. Let us consider 
two nebulae, one of high density (NGC II 4997) and the other of low density 
(NGC 6720). If we employ Eqs. (26) and (27), respectively, we shall be able 
to determine the importance of accurate transition probability and collision 
strength values. It must be emphasized that the numerical values of the T e 
will be only of the right order of magnitude since it would depend on how well 
a high or low density nebula approaches the limiting conditions that were 
imposed to obtain Eqs. (26) and (27). For observed nebulae this limiting 
condition has never been obtained. However, there exist nebulae to which the 
low density approximation may be applied to obtain rough estimates; although 
the high density approximation is poor for nebulae, it is more useful in getting 
estimates of the electron temperature in the shells of novae. The purpose in
obtaining these equations is only to indicate the importance of the A and $\Omega$
parameters. In Table 1 the values are shown for the transition probabilities
of O III as determined by Pasternack (1940) using hydrogenic wave functions
and those determined by Garstang (1951) using Hartree-Fock self-consistent
field wave functions with exchange.

TABLE 1

Transition Probabilities

Ion     Transition    Symbol   λ (Å)   A(M)      A(Q)      A_mn

Values calculated by Garstang (1951)

O III   ¹D₂ - ¹S₀     A32      4363    0.        1.60      1.600
        ³P₁ - ¹D₂     A21      4959    0.0071    0.0₅52    0.028
        ³P₂ - ¹D₂              5007    0.021     0.0*41
        ³P₁ - ¹S₀     A31      2321    0.230     0.        0.237
        ³P₂ - ¹S₀              2332    0.        0.0071

Remarks: For the electric-quadrupole moment $s_q = e\int r^2 P(nl;r)\,P(nl;r)\,dr$
Garstang used HFSCF wave functions with exchange, which were used to obtain
the $A^{(Q)}$ component of $A_{mn}$.

Values calculated by Pasternack (1940)

O III   ¹D₂ - ¹S₀     A32      4363    0.0       2.80      2.80
        ³P₁ - ¹D₂     A21      4959    0.0056    0.0*86    0.0217
        ³P₂ - ¹D₂              5007    0.016     0.0*57
        ³P₁ - ¹S₀     A31      2321    0.190     0.        0.200
        ³P₂ - ¹S₀              2332    0.        0.001

Remarks: For the electric-quadrupole moment Pasternack used hydrogenic
wave functions, which were used to obtain the $A^{(Q)}$ component of $A_{mn}$.

If the A values of Garstang and Pasternack
are introduced into Eq. (26) for the high density nebula NGC II 4997, whose
intensity ratio from observational data is
$I_{21}(\lambda 4959 + \lambda 5007)/I_{32}(\lambda 4363) = 9.2$, the electron
temperatures $T_e$ turn out to be 6890° and 5890°K, respectively. This
represents a difference of approximately 15%, which is sizable. In a similar
manner a comparison of the $T_e$'s for a low density nebula may be made. In
Table 2 we have the appropriate collision strengths required by Eq. (27) as
determined by Hebb and Menzel (1940), Seaton (1953), and Billings et al.
(1965). It will be observed there is a sizable discrepancy between the results
of Hebb and Menzel, those of Seaton, and the very recent values obtained by
Seaton and his co-workers.

TABLE 2

Collision Strengths for O III

              Seaton    Billings et al.   Hebb and Menzel
              (1953)    (1965)            (1940)

Ω12           1.73      2.24              19.08
Ω13           0.195     0.29              3.37
Ω12/Ω13       8.87      7.38              5.66

The major difficulty in the Coulomb wave approximation
as developed by Hebb and Menzel is that it violates the conservation
theorem due to Mott et al. (1933) which states that $\sum \Omega/(2J + 1) \le (2l + 1)$, where
(2J + 1) is the statistical weight of the lower level from which the collision
excitation takes place and l is the azimuthal quantum number of the partial
wave that contributes the largest share to the cross section. Hence, the cross-
section values obtained by this method may be in error by as much as one or
two orders of magnitude. The variation between the Seaton (1953) results and
his latest data (Billings et al., 1965) is due to the availability of better wave
functions and the further development of theory and mathematical techniques.
The values of $T_e$ obtained using the three sets of $\Omega$'s in Eq. (27) are 14,300°K
(Hebb and Menzel, 1940), 17,500°K (Seaton, 1953), and 16,000°K (Billings
et al., 1965). Here as in the case of the high density nebula the difference is
also sizable. Thus, if one has a $T_e$ which is in error then one also obtains a
large error in the $N_e$ which in turn gives an equally large error in the calculation
of the ionic abundances, as can be seen from the following expression for the
abundance of O III; i.e., for $N(\mathrm{O}^{++})$ we get


N(0 ++ ) = 3.7 x 10 


_ 6 6 4 (r t )exp(- x /fer g ) 

r 3/2 


N 


1 + 9380 


Tf 2 

N, 


7(iVi + N 2 ) 


KHp) 


(28) 


where $b_4(T_e)$ is a temperature dependent factor that denotes the departure of
the assembly from thermodynamic equilibrium, $I(N_1 + N_2)$ is the intensity of
the λ5007 and λ4959 forbidden lines, and $I(\mathrm{H}\beta)$ is the intensity of the hydrogen
λ4861 line. The seriousness of such discrepancies becomes even more apparent
when Ne IV is considered, an ion often found in nebulae. It has an ionization
potential of 7.3 eV, and if we use the electron temperatures of 16,000° and
17,500°K, a difference in ion concentration of 33% is obtained. It is quite
apparent from the foregoing discussion that the atomic parameters A and $\Omega$
play an important role in the study of the atomic processes that occur in a
nebula.
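
The temperature comparison quoted above for the high density case can be reproduced with a few lines of arithmetic. The sketch below assumes the high density limit in the form $I_{32}/I_{21} = (\omega_3\nu_{32}A_{32}/\omega_2\nu_{21}A_{21})\exp(-\chi_{23}/kT_e)$ with $\omega_3 = 1$ and $\omega_2 = 5$, takes λ5007 as representative of the λ4959 + λ5007 pair when forming $\nu_{32}/\nu_{21}$, and uses the total A values of Table 1. It also evaluates the low density form, written out in the code as $(\nu_{32}/\nu_{21})\,e^{-\chi_{23}/kT_e}\,[(\Omega_{12}/\Omega_{13})(1 + A_{31}/A_{32}) + e^{-\chi_{23}/kT_e}]^{-1}$, for the three pairs of collision-strength ratio and temperature quoted in the text. The function names and the choice of representative wavelength are assumptions of this example.

```python
import math

HC_OVER_K = 1.438777          # second radiation constant, cm K
LAMBDA_32 = 4363.0e-8         # cm
LAMBDA_21 = 5007.0e-8         # cm, assumed representative of 4959 + 5007
CHI23_OVER_K = HC_OVER_K / LAMBDA_32   # chi_23 / k in K
NU_RATIO = LAMBDA_21 / LAMBDA_32       # nu_32 / nu_21
W3, W2 = 1.0, 5.0             # statistical weights of 1S0 and 1D2


def high_density_temperature(i21_over_i32, a32, a21):
    """Solve the high density intensity ratio for T_e."""
    prefactor = (W3 / W2) * NU_RATIO * (a32 / a21)
    boltzmann = (1.0 / i21_over_i32) / prefactor
    return -CHI23_OVER_K / math.log(boltzmann)


def low_density_ratio(t_e, omega12_over_omega13, a31_over_a32):
    """I_32 / I_21 in the low density limit."""
    boltzmann = math.exp(-CHI23_OVER_K / t_e)
    return NU_RATIO * boltzmann / (
        omega12_over_omega13 * (1.0 + a31_over_a32) + boltzmann)


t_garstang = high_density_temperature(9.2, a32=1.600, a21=0.028)
t_pasternack = high_density_temperature(9.2, a32=2.80, a21=0.0217)
print(round(t_garstang), round(t_pasternack))

# Intensity ratios implied by the quoted (omega ratio, temperature) pairs.
implied = [1.0 / low_density_ratio(t, r, 0.237 / 1.600)
           for r, t in ((8.87, 17500.0), (7.38, 16000.0), (5.66, 14300.0))]
print([round(v, 1) for v in implied])
```

Under these assumptions the two high density temperatures come out within about ten degrees of the quoted 6890° and 5890°K, and the three low density cases imply nearly the same observed $I_{21}/I_{32}$, as they should if a single observed ratio was interpreted with three different sets of collision strengths.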

In summarizing, it may be stated that all that can be deduced about a nebula 
or a star is obtained from the light that the object emits. From the light that 
is received we can obtain information as to its direction, amount, and type, 
i.e., its color and spectrum. As can be seen from our foregoing discussion the 
structure of a nebula or the atmosphere surrounding a star depends to a large 
extent on the interpretation of its spectrum. To make such an assessment it is
necessary to determine from the spectra the observed wavelengths of the
elements in the various stages of ionization and also the transition probabilities 
for these lines. For the calculation of the latter a knowledge of the matrix 
components of the electric moments is required, and these components are 
very sensitive to small errors in wave functions. This is due to the fact that in a 
large number of cases they are of the form of integrals, over r, of functions 
which are almost as often positive as negative, and the small difference which 
represents the whole effect can be affected greatly by small changes in the wave 
functions. Likewise in collision cross-section calculations the atomic wave 
functions are important. Since the wave functions represent the electron configuration
that the colliding electron would observe, the changes in the electronic
density affect the form of the wave function for the scattered electron
and hence the collision strength. 

It is for these reasons that an intensive effort has been made in the past
decade to obtain accurate atomic wave functions; the wave functions best
suited for this purpose are of the Hartree-Fock type. For a comprehensive
exposition of the theory the reader is referred to the textbooks by Hartree 
(1958) and Slater (1960). In general two basic procedures for calculating 
Hartree-Fock Self-Consistent Field (HFSCF) wave functions are used, namely, 
one employing analytical techniques and the other using numerical ones. The 
HFSCF method of calculating the wave functions still gives the best one- 
electron representation of a many-electron atomic configuration. Within the 
past few years many of these functions have appeared in the literature, and 
more recently programs for calculating them have become available. Iterative 
programs have been developed and also calculation of various atomic wave 
functions have been made by Douglas (1954), Piper (1961), Froese (1957, 

1958,1963a,b), Mayers and Hirsh (1965), Czyzak (1962), Herman and Skillman 
(1963), Clementi (1962, 1963a,b), Krueger et al. (1965, 1966), and Chapman 
and Clarke (1966). Analytical programs have been developed by Boys and Price 
(1954), Nesbet (1955), Nesbet and Watson (1960), Roothaan (1960), and 
Watson and Freeman (1961). 

While some of the methods may be regarded as superior to others, further 
work is still necessary since the majority of the programs do not take into 
account configuration interaction. Those that do have been used for the light 
elements only and have given more accurate results. However, very recently 
Froese (1965), as well as Mayers and O’Brien (1965) have further developed 
their programs to include configuration interaction. This should improve the 
accuracy of the wave functions. Still lacking is a program for a completely 
relativistic treatment of the wave functions, which is somewhat urgently 
needed for calculating accurate atomic parameters of the heavier elements of 
astrophysical interest, i.e., those elements of Z > 38. 

It is our pleasure to honor Professor Slater on his 65th birthday by showing
in this paper how the application of the quantum theory of atomic structure
plays an important role in the calculation of atomic processes occurring in
planetary nebulae. While our emphasis here has been on the importance of
accurate atomic wave functions, this represents but a part of Professor Slater’s
contribution to the quantum theory of matter. For, as is well known, Professor
Slater’s contributions in this area are many and outstanding. His work spans a
period of approximately four decades during which he has continuously been
at the forefront in the development and exposition of quantum theory for atoms,
molecules, and solids.

Spin-Orbit Interaction In Self-Consistent Fields


E. U. CONDON and H. ODABASI 


JOINT INSTITUTE FOR LABORATORY ASTROPHYSICS 
BOULDER, COLORADO 


Although much effort has gone into the calculation of self-consistent fields 
by the Hartree, Hartree-Fock, and Hartree-Fock-Slater methods (1) since
the advent of automatic digital computers, little attention has thus far been 
paid to the use of such solutions for fields and radial functions to calculate 
spin-orbit interaction parameters. This paper is a report on some calculations 
we are making on this subject. 

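In the older accounts (2) the magnetic spin-orbit interaction is represented
by a term in the Hamiltonian

$$\mathcal{H}^{1} = \frac{e^{2}\hbar^{2}}{2m^{2}c^{2}} \sum_{i=1}^{N} \xi(r_i)\, \mathbf{L}_i \cdot \mathbf{S}_i , \tag{1}$$

the sum being over the N electrons in the atom or ion. Here $\mathbf{L}_i$ and $\mathbf{S}_i$ are
the orbital and spin angular momenta of each electron in units of $\hbar$, and $\xi(r_i)$
is commonly taken as

$$\xi(r_i) = -\frac{1}{r_i}\frac{dU(r_i)}{dr_i} , \tag{2}$$

in which $-e^{2}U(r_i)$ is the potential energy at distance $r_i$ from the nucleus of
the effective central field in which the ith electron moves.

Measuring energy in Rydbergs, $e^{2}/2a$, and length in atomic units,
$a = \hbar^{2}/me^{2}$, the coefficient in front of the sum in Eq. (1) becomes $\alpha^{2}$, where
$\alpha = e^{2}/\hbar c$ is the fine structure constant. That is,

$$\mathcal{H}^{1} = \alpha^{2} \sum_{i=1}^{N} \xi(r_i)\, \mathbf{L}_i \cdot \mathbf{S}_i . \tag{1'}$$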

Calculation of matrix elements of $\mathcal{H}^{1}$ in the $SLM_SM_L$ scheme of zero-order
states leads to (2, p. 196)

$$(\gamma SLM_SM_L|\mathcal{H}^{1}|\gamma SLM_SM_L) = \zeta(\gamma SL)\,M_SM_L = \sum_i \zeta_{n_il_i}\, m_{si}\, m_{li} , \tag{3}$$

in which (2, p. 122)

$$\zeta_{nl} = \int_0^{\infty} \alpha^{2}\, \xi(r)\, P_{nl}^{2}(r)\, dr , \tag{4}$$

and $P_{nl}(r)$ is written for r times the radial wave function as in Ref. (1).

In Eq. (3) the sum of the products $m_{si}m_{li}$ over the states of electrons in a
closed shell gives zero, so the sum can be restricted to the electrons not in
closed shells.

When hydrogenic radial functions are used in Eq. (4), the integral can be
evaluated exactly to give

$$\zeta_{nl} = \frac{\alpha^{2}Z^{4}}{n^{3}\, l(l+\tfrac{1}{2})(l+1)} \qquad (l \neq 0). \tag{5}$$
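As a quick check on Eq. (5), the short Python sketch below evaluates the
hydrogenic denominator $n^{3}l(l+\tfrac{1}{2})(l+1)$ for the orbitals discussed
in this paper. The function names and the numerical value adopted for the fine
structure constant are assumptions made for illustration; they are not taken
from the original calculations.

```python
from fractions import Fraction

# Assumed modern value of the fine structure constant (not quoted in the text).
ALPHA = 1 / 137.035999


def hydrogenic_denominator(n: int, l: int) -> Fraction:
    """Return n^3 * l * (l + 1/2) * (l + 1), the denominator of Eq. (5)."""
    if not 0 < l < n:
        raise ValueError("Eq. (5) requires 0 < l < n")
    return Fraction(n) ** 3 * l * (Fraction(l) + Fraction(1, 2)) * (l + 1)


def zeta_hydrogenic(n: int, l: int, Z: float, alpha: float = ALPHA) -> float:
    """Hydrogenic spin-orbit parameter of Eq. (5), in Rydbergs."""
    return alpha ** 2 * Z ** 4 / float(hydrogenic_denominator(n, l))


if __name__ == "__main__":
    orbitals = [(2, 1, "2p"), (3, 1, "3p"), (3, 2, "3d"), (4, 1, "4p"), (4, 2, "4d")]
    for n, l, label in orbitals:
        print(label, "zeta/Z^4 = alpha^2 /", hydrogenic_denominator(n, l))
```

For the 2p, 3p, 3d, 4p, and 4d orbitals the denominators come out as 24, 81,
405, 192, and 960, which are the hydrogenic limits quoted in the captions of
Figs. 1-5.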


In the pre-quantum-mechanical period much of the empirical analysis of
spin-orbit interaction was expressed in terms of a semiempirical formula due
to Landé (3), which can be obtained from Eq. (5) by replacing n by $n^{*}$, in
which $n^{*}$ is the effective total quantum number calculated from the observed
term energy by $E_{nl} = -Z_0^{2}/n^{*2}$, and by replacing $Z^{4}$ by $Z_i^{2}Z_0^{2}$, in which
$Z_0 = (Z - N + 1)$ and $Z_i$ is an empirically adjusted effective nuclear charge
for the inner part of the orbital.

For Russell-Saunders terms the spin-orbit factors $\zeta(\gamma SL)$ can be expressed
in terms of the $\zeta_{nl}$ for electrons outside closed shells in the configuration $\gamma$.
The contribution of the spin-orbit interaction energy $\mathcal{H}^{1}$ to the energy levels
is then (2, p. 194)

$$E^{1}(\gamma SLJ) = \tfrac{1}{2}\,\zeta(\gamma SL)\,[J(J+1) - L(L+1) - S(S+1)]. \tag{6}$$

This result gives the theoretical basis of the Landé interval rule, according to
which inside the same term $E(\gamma SL, J) - E(\gamma SL, J-1) = \zeta(\gamma SL)\,J$, so that the
interval is proportional to the higher of the two J values involved. It affords
the basis for estimation of empirical values of the $\zeta(\gamma SL)$ from observed
spectra, from which one obtains empirical values of the $\zeta_{nl}$ by using the
theoretical connections between the $\zeta(\gamma SL)$ and the individual $\zeta_{nl}$.

In the simplest case of doublet spectra due to a single electron outside of
closed shells the doublet interval gives the $\zeta_{nl}$ directly,

$$E(\gamma\,{}^{2}L_{l+1/2}) - E(\gamma\,{}^{2}L_{l-1/2}) = (l + \tfrac{1}{2})\,\zeta_{nl}. \tag{7}$$

In other cases the $\zeta(\gamma SL)$ are expressed in terms of the $\zeta_{nl}$. Examples of such
relations are given in Chap. VII of (2) and a more complete collection of
them is given by Edlén (4).
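The interval rule that follows from Eq. (6) can be verified with exact rational
arithmetic. The sketch below is illustrative only; the helper names are
hypothetical, and the spin-orbit factor is passed in as a plain number in
arbitrary units.

```python
from fractions import Fraction


def spin_orbit_shift(zeta, S, L, J):
    """Eq. (6): E = (zeta / 2) * [J(J+1) - L(L+1) - S(S+1)]."""
    return Fraction(zeta) / 2 * (J * (J + 1) - L * (L + 1) - S * (S + 1))


def allowed_J(S, L):
    """J values of a Russell-Saunders term, from |L - S| to L + S in unit steps."""
    J = abs(L - S)
    values = []
    while J <= L + S:
        values.append(J)
        J += 1
    return values


def intervals(zeta, S, L):
    """Return [(J, E(J) - E(J - 1))] for adjacent levels of one term."""
    Js = allowed_J(S, L)
    return [
        (J, spin_orbit_shift(zeta, S, L, J) - spin_orbit_shift(zeta, S, L, J - 1))
        for J in Js[1:]
    ]


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(intervals(1, half, 1))  # doublet P term
    print(intervals(1, 1, 1))     # triplet P term
```

Each interval equals the spin-orbit factor times the higher J of the pair. For
a doublet P term (S = 1/2, L = 1) the single interval is 3/2 of the factor,
which is Eq. (7) with l = 1.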

Before extensive calculations of self-consistent fields became possible, little
use could be made of Eq. (4) to compute theoretical values of the $\zeta_{nl}$ because
of lack of knowledge of the $U(r)$ and of the radial wave functions $P_{nl}(r)$.
Now an abundance of such material is available, especially through the work
of Herman and Skillman (5). They have calculated the effective central fields
and the radial wave functions for the ground-state configurations of all the
neutral atoms from $_{2}$He to $_{103}$Lw. They have also calculated the $\zeta_{nl}$ involved
in the ground states of the elements of even Z. Dependence on Z is sufficiently
smooth that those for odd Z can be found by interpolation.

In the Herman and Skillman work the same radial potential energy function
$V(r)$ is used for each electron. This is defined as

$$V(r) = \begin{cases} -2(Z - N + 1)/r, & r > r_0, \\ V_0(r), & r \le r_0, \end{cases} \tag{8}$$

where $r_0$ is defined as that value of r for which these two expressions are
equal, a procedure originally introduced by Latter (6), and

$$V_0(r) = -\frac{2Z}{r} + \frac{2}{r}\sum_{nl}\omega_{nl}\int_0^{r} P_{nl}^{2}(s)\,ds + 2\sum_{nl}\omega_{nl}\int_r^{\infty}\frac{P_{nl}^{2}(s)}{s}\,ds - 6\left[\frac{3}{8\pi}\rho(r)\right]^{1/3}. \tag{9}$$

Here $\omega_{nl}$ is the number of electrons occupying the nl orbital in the ground
state so that

$$\sum_{nl}\omega_{nl} = N, \tag{10}$$

and $\rho(r)$ is the total charge density

$$\rho(r) = (4\pi r^{2})^{-1}\sum_{nl}\omega_{nl}P_{nl}^{2}(r). \tag{11}$$

The term in Eq. (9) involving $\rho(r)$ is Slater’s effective potential energy
correction for the exchange terms of the Hartree-Fock equations (1, Vol. 2,
pp. 10-14). Wave functions based on the use of Eq. (9) in (8) are customarily
referred to as the Hartree-Fock-Slater approximation.
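The matching procedure of Eq. (8) can be sketched in a few lines of Python.
This is a toy illustration only: the exponentially screened potential used for
$V_0(r)$, the screening length `a`, and all function names are assumptions chosen
so that the matching radius is known in closed form; they are not the
Herman-Skillman potential of Eq. (9).

```python
import math


def coulomb_tail(r, Z, N):
    """Asymptotic potential of Eq. (8), in Rydbergs: -2 (Z - N + 1) / r."""
    return -2.0 * (Z - N + 1) / r


def find_r0(V0, Z, N, r_lo=1e-3, r_hi=50.0, tol=1e-12):
    """Bisection for V0(r0) = tail(r0); assumes one crossing in [r_lo, r_hi]."""
    def g(r):
        return V0(r) - coulomb_tail(r, Z, N)

    if g(r_lo) * g(r_hi) > 0:
        raise ValueError("no sign change in the bracketing interval")
    while r_hi - r_lo > tol:
        mid = 0.5 * (r_lo + r_hi)
        if g(r_lo) * g(mid) <= 0:
            r_hi = mid
        else:
            r_lo = mid
    return 0.5 * (r_lo + r_hi)


def latter_potential(V0, Z, N, r0):
    """Return V(r) of Eq. (8): V0 inside r0, the Coulomb tail outside."""
    def V(r):
        return V0(r) if r <= r0 else coulomb_tail(r, Z, N)
    return V


# Toy model (assumption): V0(r) = -2 Z exp(-r / a) / r for a neutral atom, N = Z.
Z, N, a = 11, 11, 0.8


def toy_V0(r):
    return -2.0 * Z * math.exp(-r / a) / r


r0 = find_r0(toy_V0, Z, N)
V = latter_potential(toy_V0, Z, N, r0)
```

For this toy potential the crossing satisfies $Z e^{-r_0/a} = Z - N + 1$, so the
bisection result can be compared with $r_0 = a \ln[Z/(Z - N + 1)]$.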

Herman and Skillman use the $V(r)$ so defined in determining the energy
parameters $E_{nl}$ and the radial wave function $P_{nl}(r)$ for each type of occupied
orbital in the ground states of the neutral elements. The same computer
program can also be used for the ground-state configuration of various stages
of ionization for which $N < Z$.

For a given (Z, N) one can also calculate the radial wave functions for
excited states. Strictly speaking, this ought to involve a complete recalculation
of the $P_{nl}(r)$ for all of the electrons, because excitation of one of them alters
$V_0(r)$, which alters each $P_{nl}(r)$. However, experience has shown, as was first
pointed out to us by R. N. Zare, that the gain in accuracy is too small to
justify the considerable extra computing effort. Therefore, we use the $V(r)$
found by a self-consistent solution for the ground state also to determine
the $P_{nl}(r)$ for excited states. This approximation has the added advantage that
the $P_{nl}(r)$ for excited states that are so determined form an orthonormal set
with those of the ground state.

Whether fully justifiable or not, what we have done for excited states of
the (Z, N) ion is to use the $V(r)$ determined by Eq. (8) using the occupation
numbers $\omega_{nl}$ that are appropriate to the ground configuration of that ion.

We consider next the question of what $U(r)$ should be used in Eq. (2) for
calculation of the $\zeta_{nl}$ by (4). Herman and Skillman simply identify $U(r)$ with
$V(r)$ to obtain the $\zeta_{nl}$ which they tabulate on pages 2-6 through 2-16 of their
book. We followed the same procedure in calculating $\zeta_{2p}$, $\zeta_{3p}$ in the N = 3 (Li)
isoelectronic sequence, and $\zeta_{3p}$, $\zeta_{4p}$ in the N = 11 (Na) sequence, and also
for $\zeta_{3d}$ for both of these sequences.

But there is considerable question as to whether this procedure represents
a good approximation, because $V(r)$ includes all N of the electrons and even
includes the $\rho^{1/3}$ term which Slater introduced to represent the effective
central field of the exchange energy.

Instead of starting with Eq. (1) as is usually done, it seems more appropriate
to follow the discussion given by Slater (1, Vol. 2, Chap. 24), which
leads to the conclusion (his Eqs. 24-18) that the spin-orbit interaction energy
is a sum over the i = 1, ..., N electrons of

$$\alpha^{2}\,\mathbf{S}_i \cdot \left[\frac{Z\mathbf{r}_i}{r_i^{3}} - {\sum_j}' \frac{\mathbf{r}_i - \mathbf{r}_j}{r_{ij}^{3}}\right] \times (-i\nabla_i), \tag{12}$$

in which the prime on the sum calls for the omission of the i = j term. The
quantity in brackets is the electric field at the ith electron due to the
combined influence of the nucleus and the other (N - 1) electrons.

The contribution of closed shells to this is spherically symmetrical. We
may also make the usual spherical average approximation for the other
electrons. Accordingly the field at $r_i$, due to the nucleus and the other
electrons, is $Z_{fi}(r_i)\,\mathbf{r}_i/r_i^{3}$, in which

$$Z_{fi}(r_i) = Z - {\sum_{nl}}' \omega_{nl}\int_0^{r_i} P_{nl}^{2}(s)\,ds, \tag{13}$$

since, from elementary electrostatics, the field at $r_i$ due to a spherical charge
distribution is determined by the total charge within a sphere of radius $r_i$.

This treatment leads to an altered spin-orbit interaction term, in place of
Eq. (1),

$$\mathcal{H}^{1} = \alpha^{2}\sum_{i=1}^{N}\frac{Z_{fi}(r_i)}{r_i^{3}}\,\mathbf{L}_i\cdot\mathbf{S}_i . \tag{14}$$

This is close to the usual form but with the $\zeta_{nl}$ to be calculated by

$$\zeta_{nl} = \int_0^{\infty}\frac{\alpha^{2}Z_{fi}(r)}{r^{3}}\,P_{nl}^{2}(r)\,dr, \tag{15}$$

instead of (4). Eq. (15) has a more secure theoretical basis than (4) when U is
replaced by $V_0$, and so we have also calculated some values of $\zeta_{nl}$ this way.
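The integral of Eq. (15) is easy to evaluate numerically once the radial
functions are known. The sketch below is a minimal illustration under strong
assumptions: it uses hydrogenic 1s and 2p radial functions with the same
nuclear charge Z instead of self-consistent-field functions, a simple midpoint
rule, and a modern value of the fine structure constant. All names are
hypothetical. With no screening the integral reproduces Eq. (5); with the
charge of a 1s² core subtracted as in Eq. (13) the result is smaller.

```python
import math

# Assumed modern value of the fine structure constant (not quoted in the text).
ALPHA = 1 / 137.035999


def P_2p(r, Z):
    """Normalized hydrogenic 2p radial function (r times the wave function)."""
    return Z ** 2.5 / (2.0 * math.sqrt(6.0)) * r * r * math.exp(-Z * r / 2.0)


def charge_inside_1s2(r, Z):
    """Charge of two hydrogenic 1s electrons inside radius r."""
    x = 2.0 * Z * r
    return 2.0 * (1.0 - math.exp(-x) * (1.0 + x + 0.5 * x * x))


def zeta_eq15(P, Zf, r_max, n=20000, alpha=ALPHA):
    """Midpoint-rule estimate of Eq. (15): integral of alpha^2 Zf(r) P(r)^2 / r^3."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h  # midpoints avoid the point r = 0
        total += Zf(r) / r ** 3 * P(r) ** 2
    return alpha ** 2 * total * h


if __name__ == "__main__":
    Z = 3.0
    bare = zeta_eq15(lambda r: P_2p(r, Z), lambda r: Z, 40.0 / Z)
    screened = zeta_eq15(
        lambda r: P_2p(r, Z), lambda r: Z - charge_inside_1s2(r, Z), 40.0 / Z
    )
    print("unscreened / hydrogenic:", bare / (ALPHA ** 2 * Z ** 4 / 24))
    print("screened   / hydrogenic:", screened / (ALPHA ** 2 * Z ** 4 / 24))
```

For these hydrogenic functions the screened integral can also be done by hand
and equals $\alpha^{2}(Z^{4}/24 - 4Z^{3}/81)$, which gives an independent check
on the quadrature.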

We have calculated a value of $\zeta_{nl}$ for several values of (nl) and for several
isoelectronic sequences of low values of N, both by Eq. (4) and by Eq. (15).
We have also made a systematic collection of the corresponding “observed”
values of these $\zeta_{nl}$’s by calculating them from the doublet intervals. In the
case of optical spectra the observed intervals are taken from the compilation
of Moore (7), and in the case of X-ray spectra from the compilation of
Sandström (8).

The results of this work are shown in Tables 1-5 and also in Figs. 1-7.
In Table 1, the observed and calculated values of $\zeta_{2p}$ in Rydbergs are given
for the isoelectronic sequences N = 3, 5, and 9. In the N = 3 (Li) sequence,
the $P_{2p}(r)$ belongs to an excited state in the configuration $1s^{2}2p$, the ground
state being $1s^{2}2s$. So, as already mentioned, the $P_{2p}(r)$ is the one obtained by
solving for $P_{2p}(r)$ using the potential function $V(r)$ derived from the
self-consistent field for the $1s^{2}2s$ configuration.

In the N = 5 (B) sequence, the $P_{2p}(r)$ refers to the ground configuration,
$1s^{2}2s^{2}2p$, and in the N = 9 sequence it also refers to the ground configuration,
which is $1s^{2}2s^{2}2p^{5}$, giving an inverted ${}^{2}P$ ground term. In Table 1 the values
as calculated both by Eq. (4) and by (15) are given, showing that (15) gives
somewhat better agreement with observed values than (4), although the
difference between the two methods is rather small. For N = 5 and 9 we
calculated the $\zeta_{nl}$ values only on the basis of Eq. (15).

In every case the $\zeta_{nl}$ is smaller than the hydrogenic value (5) because of
the screening by the other electrons. As the hydrogenic values increase with
$Z^{4}$, we found it convenient to exhibit the numerical results graphically by
plotting $\zeta_{nl}/Z^{4}$ against Z, choosing the hydrogenic value for the top of the
figure in each case. Figure 1 is such a plot for $\zeta_{2p}/Z^{4}$ against Z on log-log
scales, squares showing calculated values and circles showing values inferred
from observed intervals. The curves connect points of the same isoelectronic
sequence.

The general trend is as expected: for constant N, increasing Z gives approach
toward the hydrogenic value, and for constant Z, increasing N gives a decrease
of $\zeta_{2p}$ as more electrons produce more screening. Figure 1 includes the
observed values for the N = 6 and 7 sequences, showing clearly how they fit
in with the others, but we did not calculate theoretical $\zeta_{2p}$ for them.

Table 2 and Fig. 2 are similar presentations of the results for $\zeta_{3p}$. Here
the doublet in question for N = 3, 5, and 9 corresponds to excited terms.
Observations are lacking for N = 9, where the $\zeta_{3p}$ would have to be inferred
from the complex $2p^{4}3p$ structure. For the N = 11 (Na) sequence we calculated
$\zeta_{3p}$ by both methods, and again we see that Eq. (15) gives better agreement
than (4). Also here we see that the agreement of the calculated and observed
values is considerably better than in the other instances covered so far.

Fig. 1. $\zeta_{2p}/Z^{4}$ (Rydbergs) plotted against Z (logarithmic scales). Top of figure is
hydrogenic value, $\alpha^{2}/24$. Squares are calculated, circles observed values for optical spectra,
by number of electrons, N, in isoelectronic sequences. Triangles are calculated values and
crosses are observed values of the $L_{II}$-$L_{III}$ doublet interval due to the $(2p)^{5}$ configuration
in X-ray spectra for which N = Z - 1.

Table 3 and Fig. 3 give a similar presentation of the results for $\zeta_{3d}$.
Here the $\zeta$’s are considerably smaller than for the p-orbitals because of their
less penetrating character. Also the “observed” values cannot be directly
inferred from the doublet intervals for the low values of Z in the sequences
because some of these are actually inverted, due to the additional contribution
to the doublet interval arising from higher-order perturbations such as were
studied by Phillips (9). Even where the configuration-interaction effect is not
large enough to invert the ${}^{2}D$, it may still produce an appreciable change
in the doublet interval which is not considered in these comparisons. Here we
see that there is quite close agreement of observed and calculated values in
the N = 3 and 5 sequences, but less good agreement for the N = 11 sequence.

Fig. 2. $\zeta_{3p}/Z^{4}$ (Rydbergs) plotted against Z (logarithmic scales). Top of figure is
hydrogenic value, $\alpha^{2}/81$. Points designated as in Fig. 1 except that X-ray values refer to the
$M_{II}$-$M_{III}$ doublet interval of the $(3p)^{5}$ configuration.

Table 4 and Fig. 4 provide a similar presentation of the results for $\zeta_{4p}$.
Here we made calculations for the N = 11 sequence, using Eqs. (4) and (15).
We did not make any calculations for $\zeta_{4d}$, but Fig. 5 gives a presentation of
the observed values indicating that they show general trends similar to the
other cases already considered.

Now we turn to a brief discussion of the corresponding doublet intervals
in X-ray spectra as also shown in Figs. 1-5 and in Table 5 and Figs. 6 and 7.
In X-ray spectra the doublet arises from a vacancy in a normally closed shell,
that is, in an atom from which one electron has been removed, so that
N = Z - 1. Thus the ($L_{II}$-$L_{III}$) interval arises from the open $(2p)^{5}$
shell in atoms in which the 2p shell is normally filled. Similarly the ($M_{II}$-$M_{III}$)
interval arises from the open $(3p)^{5}$ shell in atoms in which the 3p shell is
normally filled, and the ($M_{IV}$-$M_{V}$) interval arises from the open $(3d)^{9}$ shell in
atoms in which the 3d shell is normally filled.

TABLE 2b
$\zeta_{3p}$ (in Rydbergs), N = 11

 Z   Element   Observed        Calculated (4)   Calculated (15)
11   Na        0.0001044694    0.00011910       0.00011287
12   Mg        0.0005561761    0.00064770       0.00061754
13   Al        0.001412463     -                0.00158117
14   Si        0.0027963722    0.00323256       0.00310858
15   P         0.0048272808    0.00553160       0.00533591
16   S         0.0076728614    0.00870418       0.00841872
17   Cl        0.011481954     0.01292626       0.01253175
18   Ar        0.016512143     -                0.01786591
19   K         0.022878856     -                0.02462716
20   Ca        0.030867624     0.0338976        0.03304006
21   Sc        0.040642472     0.04440932       0.04334373
22   Ti        0.052610435     0.05709084       0.05579154
23   V         0.067312196     0.07222646       0.0706555
24   Cr        0.084565502     0.0900834        0.0882197
25   Mn        0.104491856     0.11099286       0.10878599
26   Fe        0.127395014     0.13524558       0.13267676
27   Co        0.15418624      0.16319978       0.16021989
28   Ni        0.184683282     0.19521012       0.19177413
29   Cu        0.219311398     0.23160832       0.22770043
30   Zn        -               -                0.26837353
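The relative quality of Eqs. (4) and (15) in the N = 11 sequence can be read off
by computing percentage deviations from the observed values. The sketch below
uses a handful of rows transcribed from Table 2b (only rows with all three
entries present); the helper names are hypothetical.

```python
# (Z, observed, calculated by Eq. (4), calculated by Eq. (15)), from Table 2b.
ROWS = [
    (11, 0.0001044694, 0.00011910, 0.00011287),
    (12, 0.0005561761, 0.00064770, 0.00061754),
    (14, 0.0027963722, 0.00323256, 0.00310858),
    (20, 0.030867624, 0.0338976, 0.03304006),
    (26, 0.127395014, 0.13524558, 0.13267676),
    (29, 0.219311398, 0.23160832, 0.22770043),
]


def percent_deviation(calculated, observed):
    """Signed deviation of a calculated value from the observed one, in percent."""
    return 100.0 * (calculated - observed) / observed


def compare(rows):
    """Return [(Z, deviation of Eq. (4), deviation of Eq. (15))]."""
    return [
        (Z, percent_deviation(c4, obs), percent_deviation(c15, obs))
        for Z, obs, c4, c15 in rows
    ]


if __name__ == "__main__":
    for Z, d4, d15 in compare(ROWS):
        print(f"Z = {Z:2d}   Eq. (4): {d4:+6.2f}%   Eq. (15): {d15:+6.2f}%")
```

In every row listed both calculated values lie above the observed one, and the
Eq. (15) value is the closer of the two.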


The range of Z-values in the figures covers that in which the respective 
X-ray terms first make their appearance through filling of the shells. Here 
there is a considerable scatter in the X-ray observed values because the 
doublet intervals are quite small. The crosses show observed values and the 
triangles show the calculated values given by Herman and Skillman for 
even Z, supplemented by our calculations of the values for odd Z, by the 
same method. Making allowance for the scatter in the observed values, 
agreement of observation and calculation in all these cases is quite good. 

Table 5 and Figs. 6 and 7 bring out a more detailed comparison between a
particular doublet interval in optical spectra and the corresponding interval
in X-ray spectra, for the particular case of $\zeta_{2p}$. The comparison is between
the doublet interval due to $(2p)^{5}\,{}^{2}P$ in the N = 9 optical spectra and the
$L_{II}$-$L_{III}$ interval due to $(2p)^{5}\,{}^{2}P$ in the N = Z - 1 X-ray spectra for
Z ≥ 10. In the N = 9 sequence the configuration remains as $1s^{2}2s^{2}2p^{5}$ with
increasing Z. In the X-ray spectra for Z = 10 we have the same configuration,
but for larger values of Z we have an increasing number of outer electrons
added to this configuration. These

TABLE 3b
$\zeta_{3d}$ (in Rydbergs), N = 11

 Z   Element   Observed         Calculated (4)   Calculated (15)
11   Na        -0.000000181     0.00000016       0.00000016
12   Mg        0.000003645      -                0.00000352
13   Al        -0.000008311     0.00002548       0.00002207
14   Si        -                -                0.00007680
15   P         +0.000040825     0.00021686       0.00019266
16   S         0.000116642      0.00044114       0.00039622
17   Cl        0.0002660897     0.00078868       0.00071599
18   Ar        0.0005504048     0.00129314       0.00118294
19   K         0.0010643589     -                0.00183076
20   Ca        0.0015090568     -                0.00269633
21   Sc        0.00225994014    -                0.00381951
22   Ti        0.00324410767    0.00559484       0.00524336
23   V         0.0044834297     0.00746142       0.00701386
24   Cr        0.0062330608     0.00972088       0.00918058
25   Mn        0.0079826919     0.01246656       0.01179568
26   Fe        0.010351984      0.01570152       0.01491490
27   Co        0.013231585      0.01953884       0.01859687
28   Ni        0.016767298      0.02401522       0.02290380
29   Cu        0.0211413758     0.02916054       0.02700952
30   Zn        -                -                0.03365597


TABLE 4
$\zeta_{4p}$ (in Rydbergs), N = 11

 Z   Element   Observed        Calculated (4)   Calculated (15)
11   Na        0.0000342029    0.00003966       0.00003759
12   Mg        0.0001852908    0.00021982       0.00020958
13   Al        0.0004867984    -                0.00055007
14   Si        0.0009829525    0.00114856       0.00110441
15   P         0.0017253307    0.00199928       0.00192845
16   S         0.0027763243    -                0.00308599
17   Cl        0.0041796742    -                0.00464797
18   Ar        0.0060568825    -                0.00669253
19   K         0.0087542305    -                0.00930448
20   Ca        0.011573081     0.01290282       0.01257543
21   Sc        0.015370023     0.01701310       0.01660404
22   Ti        0.020290861     0.02199574       0.02149372
23   V         0.026548222     0.0279600        0.02735821
24   Cr        0.032623330     -                0.03431330
25   Mn        0.040946228     0.04335066       0.04248527
26   Fe        0.048965370     0.05301320       0.05200558
27   Co        0.0612737089    0.06419056       0.06301278
28   Ni        -               0.07701256       0.07565116
29   Cu        -               0.09162308       0.09007627
30   Zn        -               -                0.10643764

Fig. 3. $\zeta_{3d}/Z^{4}$ (Rydbergs) plotted against Z (logarithmic scales). Top of figure is
hydrogenic value, $\alpha^{2}/405$. Points are designated as in Fig. 1 except that X-ray values
refer to the $M_{IV}$-$M_{V}$ doublet intervals.

are $3s$, $3s^{2}$ for Na and Mg, then a gradual filling of the 3p shell from Al to Ar
and the addition of $4s$ and $4s^{2}$ for Z = 19 and 20.

Hence the distinction between the $\zeta_{2p}$ for a given Z is, for the X-ray case, the
additional screening produced by these “outer” electrons to the small extent
that they penetrate to radial values within the 2p-orbital, as compared with
the N = 9 sequence for which no outer electrons are added to the $1s^{2}2s^{2}2p^{5}$
configuration. This effect means that $\zeta_{2p}$ should be smaller for the X-ray
sequence than for the N = 9 sequence. The calculated values are shown in
Table 5, together with the differences $[\zeta_{2p}(N = 9) - \zeta_{2p}(N = Z - 1)]$ in the
column headed Δ.

In Fig. 6 the corresponding $\zeta_{2p}$ are shown, the squares referring to
calculated values for the N = 9 sequence, and the triangles to the smaller
calculated values for the $\zeta_{2p}$ for the X-ray intervals, while the crosses show
the $\zeta_{2p}$ inferred from the X-ray levels. These latter, despite scatter, tend to be
systematically smaller than the calculated $\zeta_{2p}$. Figure 7 shows a graph of the
difference Δ against Z.

Fig. 4. $\zeta_{4p}/Z^{4}$ (Rydbergs) plotted against Z. Top of figure is hydrogenic value, $\alpha^{2}/192$.
Points are designated as in Fig. 1 except that X-ray values refer to the $N_{II}$-$N_{III}$ doublet
interval.

In conclusion it seems fair to say that the calculations show that the
self-consistent fields are capable of giving quite good values of the spin-orbit
parameters, both with regard to absolute values and general trends with N
and Z. Some discrepancies remain which can be reduced by the use of
more elaborate approximations (10, 11), but this work shows that the
spin-orbit parameters are quite adequately given for most purposes by
calculations of the type considered here.

Fig. 5. Same as preceding figures, but for $\zeta_{4d}/Z^{4}$. Top of figure is hydrogenic value,
$\alpha^{2}/960$. Points refer to $4\,{}^{2}D$ intervals in optical spectra and $N_{IV}$-$N_{V}$ $(4d)^{9}$ intervals in
X-ray spectra.

TABLE 5
$\zeta_{2p}$ (in Rydbergs)

               X-ray (N = Z - 1)                        N = 9         Δ
 Z   Element   Observed      Calc. (4)    Calc. (15)    Calc. (15)    (15)
10   Ne        0.00669868    0.00574064   0.0061954     0.0061954     0.00000000
11   Na        0.01004802    0.0098288    0.00945871    0.01046786    0.00100915
12   Mg        0.01339736    0.0156159    0.0150942     0.0166128     0.00151860
13   Al        0.01808644    0.0236047    0.022897      0.0251047     0.00220770
14   Si        0.02679472    0.0342703    0.0333372     0.0364729     0.00313570
15   P         0.04019208    0.0481425    0.0469410     0.0512990     0.0043580
16   S         0.06028812    0.065802     0.0642856     0.0702184     0.00593280
17   Cl        0.07368548    0.0879598    0.0859962     0.0939203     0.00792410
18   Ar        0.13397360    0.1150476    0.1127460     0.1231470     0.01041010
19   K         0.14737096    0.1481618    0.1453785     0.1586913     0.01331280
20   Ca        0.18086436    0.1880368    0.1847128     0.20140271    0.01668991
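The column of differences in Table 5 can be recomputed from the two columns
calculated by Eq. (15). The sketch below transcribes those columns as they
appear in the table; the dictionary layout, the function names, and the
tolerance are illustrative assumptions.

```python
# Z: (X-ray value by Eq. (15), N = 9 value by Eq. (15), tabulated difference),
# transcribed from Table 5.
TABLE5 = {
    10: (0.0061954, 0.0061954, 0.00000000),
    11: (0.00945871, 0.01046786, 0.00100915),
    12: (0.0150942, 0.0166128, 0.00151860),
    13: (0.022897, 0.0251047, 0.00220770),
    14: (0.0333372, 0.0364729, 0.00313570),
    15: (0.0469410, 0.0512990, 0.0043580),
    16: (0.0642856, 0.0702184, 0.00593280),
    17: (0.0859962, 0.0939203, 0.00792410),
    18: (0.1127460, 0.1231470, 0.01041010),
    19: (0.1453785, 0.1586913, 0.01331280),
    20: (0.1847128, 0.20140271, 0.01668991),
}


def recomputed_delta(table):
    """Difference zeta_2p(N = 9) - zeta_2p(N = Z - 1) for each Z."""
    return {Z: n9 - xray for Z, (xray, n9, _) in table.items()}


def mismatches(table, tol=5e-8):
    """Z values whose tabulated difference disagrees with the recomputed one."""
    return [
        Z
        for Z, (xray, n9, tabulated) in table.items()
        if abs((n9 - xray) - tabulated) > tol
    ]


if __name__ == "__main__":
    for Z, delta in recomputed_delta(TABLE5).items():
        print(Z, f"{delta:.8f}")
    print("rows that disagree with the tabulated column:", mismatches(TABLE5))
```

With the values as transcribed here, the recomputed difference agrees with the
tabulated one in every row except Z = 18, where the two differ in the fifth
decimal place; whether that reflects the original table or its transcription
cannot be decided from the numbers alone. The recomputed difference grows
steadily with Z, as expected from the increasing number of outer electrons.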



Fig. 6. Effect of screening on the doublet $2\,{}^{2}P$ interval as shown in the value of $\zeta_{2p}$. Squares
are calculated values for the $1s^{2}2s^{2}2p^{5}$ (N = 9) ground states of the fluorine isoelectronic
sequence. Triangles are calculated values for the N = Z - 1 $(2p)^{5}$ X-ray interval in
elements Z = 10 to 20. Crosses are observed X-ray doublet intervals.

Fig. 7. Calculated difference between $\zeta_{2p}$ in the N = 9 isoelectronic sequence and in the
calculated X-ray doublet interval due to $(2p)^{5}$ with N = Z - 1.

I. Introduction

In 1951 in the course of a discussion at the first post-war scientific conference
in West Berlin, Professor Carl Ramsauer remarked how interesting it
would be if it were possible to repeat the experiments on the scattering of slow
electrons by atoms with the electrons replaced by positrons. One’s first
response to this is apt to be that, because of its positive charge, a positron is
repelled by the mean atomic field, and such phenomena as the Ramsauer-Townsend
effect, which produces a minimum total cross section for collisions
of electrons with argon, krypton, and xenon atoms at energies of the order of
0.5 eV, would not be expected to occur. Monotonic behavior of the positron-atom
cross sections as a function of positron energy would be anticipated.
However, this is based on a misconception. It seems that the Ramsauer-Townsend
effect for electron collisions arises principally because of the long-range
polarization of the atom by the incoming slow electron. As long ago as
1929, Holtsmark found that the effect could be reproduced theoretically by
calculating the scattering from the Hartree field of argon modified by addition
of an empirical polarization potential having approximately the correct
asymptotic form. He found a similar result for krypton (Holtsmark, 1930). In the
last few years, improved methods of calculation of the scattering of slow
electrons by atoms have been developed, and it is found that, if allowance is
made for scattering by the undisturbed mean static field of the atom, by the
nonlocal interaction due to electron exchange, and by a dipole polarization
field due to disturbance of the atom, then good agreement with observation is
found down to the lowest electron energies. Such calculations for argon,
carried out by Thompson (1966), show that the Ramsauer-Townsend effect
only appears when the polarization field is introduced in addition to the static
and exchange interactions.

Turning now to the scattering of slow positrons we note that, while the mean 
static atomic field is now repulsive, the polarization field should be essentially
the same as for slow electrons. In view of the importance of the latter field for 
slow-electron scattering we must admit the possibility that it will be dominant 
also for the scattering of slow positrons. There may in fact exist effects similar 
to the Ramsauer-Townsend effect for some atoms. Before discussing this 
point further we must also point out that, although in the positron case there 
is no nonlocal interaction due to exchange, such an interaction is present 
through the possibility of virtual or real positronium production. 

Calculations already carried out for the scattering of slow positrons by 
hydrogen atoms have confirmed the importance of polarization. Thus variational
calculations using elaborate trial functions (Schwartz, 1961) have shown
that the zero-energy scattering length for positrons is negative, corresponding 
to an attractive potential. 

The possibility of carrying out experiments with slow positrons as mentioned
by Ramsauer is now quite close. This is because of the growing availability of
intense positron sources generated from the high-current electron beams which
have been accelerated in linear accelerators. It should not be long before both
total and differential cross sections for elastic scattering of slow positrons of
well-defined energy will be observed. Apart from such direct observations,
indirect evidence about the cross sections may be obtained from observation
of the annihilation spectra of positrons in gases.

A technique of this kind which has been especially fruitful is that of
Garth Jones and his collaborators (Falk et al., 1965; Jones et al., 1965). They
allow positrons diffusing in the gas to come to equilibrium in the presence of an
electric field. The mean energy and energy distribution of the positrons are
then determined by the field and by the momentum loss cross section for
positron collisions with the gas atoms. For positrons of a given velocity v
the annihilation cross section from gas atoms is given by

$$Q_a = \zeta\,\pi r_0^{2}\,c/v , \tag{1}$$

where $r_0$ is the classical electron radius $e^{2}/mc^{2}$; $\zeta$ is an effective number of
atomic electrons and is given by (Ore, 1949)

$$\zeta = \int \rho(\mathbf{r})\,|F(\mathbf{r})|^{2}\,d\mathbf{r} , \tag{2}$$

where $\rho(\mathbf{r})$ is the number density of electrons in the atom at a distance r from
the nucleus and $F(\mathbf{r})$ is the wave function of the motion of the positron
referred to the nucleus as origin, which has the asymptotic form

$$F(\mathbf{r}) \sim e^{ikz} + r^{-1}e^{ikr}f(\theta). \tag{3}$$

k is the positron wave number $mv/\hbar$. It follows that the mean rate of
annihilation of positrons in a gas in the presence of an electric field of
strength E is given by

$$R = \pi r_0^{2}\,c\,N \int f(E, v)\,\zeta(v)\,dv , \tag{4}$$

where N is the number of gas atoms per unit volume and $f(E, v)$ is the velocity
distribution function for the positrons. The mean annihilation rate may be
observed as a function of E. It depends both through $f(E, v)$ and $\zeta$ on the
wave function describing the collision of a positron of given velocity v
with the atom. To take advantage of this it is best to make some assumptions
about the effective positron interaction, determine the corresponding wave
function F and thence f and $\zeta$, so that R may be obtained and compared with
observation. This procedure may be used either to check an approximate
theory or, by a trial and error procedure, to determine an empirical effective
positron-atom interaction. Considerable progress in applying the latter
technique has been made by Jones et al. (1965) for argon and will be referred
to again below.
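The step from Eq. (1) to Eq. (4) can be spelled out as follows, writing
$\zeta(v)$ for the effective number of atomic electrons and assuming (a
normalization not stated explicitly in the text) that $\int f(E, v)\,dv = 1$.
A positron of velocity v annihilates at the rate $N v Q_a$, in which the factor
v cancels:

```latex
\begin{align*}
N v\,Q_a(v) &= N v \cdot \frac{\zeta(v)\,\pi r_0^{2} c}{v}
             = \pi r_0^{2} c\,N\,\zeta(v), \\
R &= \int f(E, v)\,N v\,Q_a(v)\,dv
   = \pi r_0^{2} c\,N \int f(E, v)\,\zeta(v)\,dv .
\end{align*}
```

The field enters only through the distribution $f(E, v)$, so any dependence of
R on E reflects the velocity dependence of $\zeta$. If $\zeta$ were independent
of v, R would reduce to $\pi r_0^{2} c\,N\,\zeta$ for every field strength.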

Other even more indirect methods for investigating collision phenomena of
slow positrons in gases are available. In view of all these sources of
experimental information, which are either already in operation or will shortly
be so, it is worthwhile to make some preliminary calculations for positron
collisions with rare gas atoms to examine the kind of results to be expected
when polarization is allowed for.

In carrying out the calculations which we now present, virtual positronium
production is not allowed for. This means that the true effective attractive
field is likely to be somewhat greater than we assume. It is, of course, possible
to allow for this empirically by increasing the polarizability above the
experimental value, but in the absence of any control data from experiments
there is no way of determining how much increase to allow for. Judging by the
results of the atomic hydrogen calculations (Cody et al., 1964), virtual
positronium formation contributes considerably less to the effective attraction
for slow positrons than does polarization.

II. Allowance for Polarization in the Theory of Scattering by Rare Gas Atoms 

For argon and krypton we have used the semiempirical interactions which 
Holtsmark found were successful in yielding a good approximation for the 
scattering of slow electrons by atoms. This interaction took the form 

$$V_{-}(r) = V_{H}(r) + V_{p}(r), \qquad (5)$$

where $V_H$ is the mean interaction with the Hartree charge distribution of the
atom and $V_p$ is a polarization potential with the asymptotic form

$$V_{p} \sim -\alpha e^{2}/r^{4}, \qquad (6)$$

$\alpha$ being the atomic polarizability. We take for positrons

$$V_{+}(r) = -V_{H}(r) + V_{p}(r). \qquad (7)$$

The differential cross section $I(\theta)\,d\omega$ for scattering through the
angle $\theta$ into the solid angle $d\omega$ is then given by

$$I(\theta) = |f(\theta)|^{2}, \qquad (8)$$

where 

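$$f(\theta) = \frac{1}{2ik} \sum_{l} (2l + 1)(e^{2i\eta_{l}} - 1) P_{l}(\cos\theta) \qquad (9)$$

and $\eta_l$ is such that the solution of

$$\frac{d^{2}g_{l}}{dr^{2}} + \left[k^{2} - \frac{l(l+1)}{r^{2}} - \frac{2m}{\hbar^{2}} V_{+}\right] g_{l} = 0 \qquad (10)$$

which vanishes at the origin has the asymptotic form

$$g_{l} \sim \sin(kr - \tfrac{1}{2}l\pi + \eta_{l}). \qquad (11)$$

The total elastic and momentum-loss cross sections, $Q_t$ and $Q_m$
respectively, are then given by

$$Q_{t} = 2\pi \int_{0}^{\pi} I(\theta) \sin\theta \, d\theta, \qquad Q_{m} = 2\pi \int_{0}^{\pi} (1 - \cos\theta)\, I(\theta) \sin\theta \, d\theta. \qquad (12)$$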


The phases $\eta_l$ were determined by electronic computation using the
departmental computer in the Physics Department at University College, London.
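Once a set of phases is available, the amplitude (9) and the cross sections (12) follow directly. The sketch below is an illustration under stated assumptions rather than the program used for the calculations reported here: the phase shifts are hypothetical numbers, and the closed partial-wave sums used as a cross-check are the standard textbook equivalents of the angular integrals, which the text itself does not quote.

```python
import cmath
import math

def legendre_values(l_max, x):
    """Return [P_0(x), ..., P_l_max(x)] from the three-term recurrence."""
    values = [1.0, x][: l_max + 1]
    for l in range(1, l_max):
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values

def amplitude(k, phases, theta):
    """Scattering amplitude f(theta) of Eq. (9); phases[l] is eta_l in radians."""
    p = legendre_values(len(phases) - 1, math.cos(theta))
    total = sum((2 * l + 1) * (cmath.exp(2j * eta) - 1) * p[l]
                for l, eta in enumerate(phases))
    return total / (2j * k)

def cross_sections_by_quadrature(k, phases, steps=4000):
    """Q_t and Q_m from Eq. (12) by the midpoint rule over theta in [0, pi]."""
    h = math.pi / steps
    q_t = 0.0
    q_m = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        weight = abs(amplitude(k, phases, theta)) ** 2 * math.sin(theta) * h
        q_t += weight
        q_m += (1 - math.cos(theta)) * weight
    return 2 * math.pi * q_t, 2 * math.pi * q_m

def cross_sections_by_sums(k, phases):
    """Closed partial-wave sums equivalent to the integrals in Eq. (12)."""
    eta = list(phases) + [0.0]  # phases beyond the list are taken as zero
    q_t = sum((2 * l + 1) * math.sin(eta[l]) ** 2 for l in range(len(phases)))
    q_m = sum((l + 1) * math.sin(eta[l] - eta[l + 1]) ** 2
              for l in range(len(phases)))
    factor = 4 * math.pi / k ** 2
    return factor * q_t, factor * q_m

# Hypothetical phase shifts, for illustration only.
k, phases = 0.5, [0.30, -0.10, 0.02]
print(cross_sections_by_quadrature(k, phases))
print(cross_sections_by_sums(k, phases))
```

With $k$ in atomic units the cross sections come out in units of $a_0^2$. The two routes should agree to within the quadrature error, which is a convenient check on any set of computed phases.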

For helium and neon a somewhat less empirical procedure was adopted, based on
the exchange adiabatic approximation of Temkin and Lamkin (1961). Following
this method as formulated for electrons we find a similar interaction to (7)
but with the Hartree-Fock instead of the Hartree approximation (the two are of
course the same for helium) and a polarization potential which is determined
as follows.

Consider for simplicity the case of helium. We write an approximate wave
function describing the system of two atomic electrons (coordinates $r_1$,
$r_2$) and a free positron (coordinate $r_3$) in the form

$$\Psi(r_1, r_2, r_3) = [\psi_0(r_1, r_2) + \phi_0(r_1, r_2, r_3)]\, F(r_3), \qquad (13)$$

where $\psi_0(r_1, r_2)$ is the ground state wave function of the helium atom
and $F$ is a proper function which has the asymptotic form

$$F(r) \sim e^{ikz} + r^{-1} e^{ikr} f(\theta), \qquad (14)$$

$k$ being the wave number of the incident positron.

The function $\phi_0$ represents the distortion of the atomic wave function
through interaction with the incident positron. Following the same procedure
as that of Temkin and Lamkin (1961) for dealing with slow-electron collisions
we treat the perturbation due to the slow positron as adiabatic provided
$r_3 > r_1, r_2$. We take the perturbing potential to be

$$v = v_{13} + v_{23},$$

where

$$v_{13} = (2r_1/r_3^{2}) \cos\theta_{13}, \qquad r_3 > r_1, \qquad (15)$$

$\theta_{13}$ being the angle between $r_1$ and $r_3$. In this way we only
include the dipole distortion of the atom. For $r_3 < r_1, r_2$, the perturbing
potential is taken to be zero. This arbitrary assumption is probably less
justified for positron than for electron collisions because of the absence of
any symmetry relations between the incident and atomic particles.

Given $v$, the change $\phi_0$ in the atomic wave function may be calculated
as a function of $r_1$, $r_2$, and $r_3$, using a method due to Sternheimer
(1954).

Having determined $\phi_0$ in this way, a differential equation is obtained
for $F(r_3)$ by requiring that

$$\int \psi_0(r_1, r_2)\,(H - E)\,\Psi(r_1, r_2, r_3)\, dr_1\, dr_2 = 0, \qquad (16)$$

where $H$ is the Hamiltonian for the three-particle system and $E$ the total
energy. This leads to the equation

$$\left[\nabla^{2} + k^{2} - \frac{2m}{\hbar^{2}}(-V_{H} + V_{p})\right] F = 0, \qquad (17)$$

where $V_p$ is now given in terms of $\psi_0$ and $\phi_0$. It has the
asymptotic form

$$V_{p} \sim -\alpha e^{2}/r^{4} \qquad (18)$$

but $\alpha$ is now a calculated rather than the observed polarizability.
$V_p$ tends to zero for small $r$ as $r^2$. As the most important effect of
polarization arises from the long-range interaction which it introduces, it is
probable that the way in which this interaction is cut off at small distances
is not very important (cf. Lawson et al., 1966).

The Temkin-Lamkin procedure is consistent and convenient and gives very
good results in electron-atom collisions. We must remember here, however,
that in the latter cases it is also necessary to take into account electron
exchange. The equivalent in the positron case, virtual positronium formation,
has not been included in the present calculations.

Extension of the above procedure to neon presents no difficulties if one
continues to work with the Hartree-Fock approximation (Mertz and Torrey,
1963) to the ground state wave function.

III. Results

Figures 1-3 illustrate the total elastic and momentum loss cross sections 
for positron collisions with helium, neon, argon and krypton calculated as 
described above. It must be realized that we cannot expect the results to 
represent reality too closely but they should expose the possibilities. 

The cross section for helium is very small and, with the approximations made,
exhibits a Ramsauer-Townsend effect at a positron energy of about 1.2 eV.
For helium the value of the polarizability as calculated by the method
outlined above is 1.56 a.u., which is a little greater than the observed value
of 1.38 a.u. (Herzfeld and Wolf, 1925), so that the calculation already allows
a little for a further effective attraction such as that due to virtual
positronium formation. Nevertheless it is likely that the full attraction is
still a little underestimated. This means that the cross section minimum will
be shifted to rather higher positron energies. Indirect evidence (Teutsch and
Hughes, 1956) from the experiments of Marder et al. (1956) indicates a very
low momentum-loss cross section for helium at a mean energy close to the
positronium formation threshold, about 18 eV. Although this is qualitatively
in agreement with expectation from our calculations it is still smaller than
we would anticipate.

Fig. 1. Calculated total and momentum-loss cross sections for collisions of
slow positrons with helium and with neon atoms. Curves: total cross section;
momentum-loss cross section.

Fig. 2. Calculated total and momentum-loss cross sections for collisions of
slow positrons with argon atoms. Curves: total cross section; momentum-loss
cross section; total cross section derived by Jones et al. (1965).

Neon also presents an interesting situation. Our calculated polarizability is
2.41 a.u., a little smaller than the observed 2.67 a.u. It is therefore likely
that the true effective attractive interaction is underestimated by our
method. In that case it looks likely that a Ramsauer-Townsend effect exists
for collisions with neon of positrons with quite low energy; in our
calculations, which somewhat underestimate the attraction, the minimum has not
quite appeared even at zero energy. As for helium, the momentum-loss cross
section to be expected at a mean positron energy of 15 eV is much larger than
suggested from the analysis (Teutsch and Hughes, 1956) of the experiment of
Marder et al. (1956). This gives about $0.12\pi a_0^2$, to be contrasted with
about $1.0\pi a_0^2$ for our calculations.

Fig. 3. Calculated total and momentum-loss cross sections for collisions of
slow positrons with krypton atoms, plotted against positron wave number k
(au). Curves: total cross section; momentum-loss cross section.

For argon our semiempirical interaction leads to a rather flat and shallow
minimum in the total cross section at about 1.2 eV. In this case we may
compare with the cross section derived from an effective interaction found by
Falk et al. (1965) to give a good fit to their observations on the
annihilation rate of positrons in argon as a function of time. Considering the
simplicity of the theoretical approximation the agreement is perhaps not too
unsatisfactory, but it leaves unanswered the question as to whether a minimum
cross section really exists for argon. The evidence is that conditions are not
unfavorable for this, though the minimum may be very shallow or smeared out
completely. The momentum-loss cross section which we obtain at 9 eV is
$1.1\pi a_0^2$, which is closer to the value $2\pi a_0^2$ derived by Teutsch
et al. (1956) than for helium and neon.

Finally, for krypton, the calculated results show a behavior rather similar
to that for argon except that the momentum-loss cross section differs very
much more in shape from the total cross section. Thus $Q_m$ has a very deep
minimum at an energy below 0.5 eV which has almost disappeared from $Q_t$.
This is clearly a sensitive case which it would be of considerable interest to
investigate experimentally.

In Figs. 4-7 differential elastic cross sections are shown for positrons of a
range of energies scattered by each rare gas. These angular distributions are
highly variable and far from monotonic as would be the case if the mean
effective interaction were a monotonic repulsion.

Fig. 4. Differential cross sections for scattering of slow positrons by helium atoms.

TABLE 1

Effective Number $\xi$ of Annihilation Electrons per Atom for Positrons in
Helium, Neon, Argon and Krypton, Calculated from (2)

Positron wave number (au)    0.2     0.4     0.6     0.8     1.0
$\xi$ helium                 1.22    1.11    1.095   1.11    1.12
$\xi$ neon                   1.43    1.53    1.79    2.04    2.38
$\xi$ argon                  2.42    2.40    2.70    3.00    3.23
$\xi$ krypton                5.08    4.77    2.18    2.33    2.36

Fig. 6. Differential cross sections for scattering of slow positrons by argon atoms.

Finally in Table 1 we give the effective values of the number $\xi$ of
annihilation electrons per atom as a function of positron energy for each gas.
These have been calculated from the formula (2) with $F(r)$ given from the
solution of (17).
No allowance has therefore been made for the distortion of the charge 
distribution in the atom during the collision with the positron. To include 
this effect requires a more accurate solution of the wave equation for the 
three-particle system than we have obtained. In particular we need to know 
more accurately how the function $\phi_0$ in (13) behaves when $r_3$ is nearly
equal to $r_1$ or $r_2$. It follows that the effective number of annihilation
electrons, if it could be measured accurately as a function of positron energy
at low energies, would be a more sensitive test of any theory of low energy
positron collisions with atoms than the differential elastic cross sections.
All we can say at this stage is that allowance for atomic distortion will tend
to increase the values of $\xi$ above those given in Table 1 because the
electron density will be increased at large distances from the atom where the
positron experiences an effective attraction.

Fig. 7. Differential cross sections for scattering of slow positrons by krypton atoms.

From the values of $\xi$ given in Table 1 together with the momentum-loss
cross sections in Figs. 1-3 annihilation rates for positrons in the presence of 
electric fields may be derived and used as a basis for analyzing observed data 
on these rates. 


IV. Summary 

It is clear that, due to the destructive interference between the scattering
from the mean repulsive atomic field and that due to the attraction arising
from atom polarization, the elastic scattering of slow positrons by atoms is
likely to show features at least as “unusual” as for the corresponding
scattering of slow electrons. Accurate prediction of the behavior in any
particular case is likely to present a considerable challenge to theorists
because of the sensitivity arising from the destructive interference which is
inherent.

I. Introduction

Slater screening constants and Slater orbitals (Slater, 1930) have enabled a 
whole generation of chemists and physicists to calculate approximate values 
of many of the physical properties of atoms. The use of screening constants 
implies that an atomic orbital corresponding to a given set of electronic 
quantum numbers is the same, except for a uniform scaling of the coordinates, 
quite irrespective of which atom or ion is under consideration. This concept 
of interchangeability of orbitals has greatly simplified our mental picture of 
atomic structure. 

An adequate simple model of a molecular orbital certainly must be more 
complicated than a Slater atomic orbital. The molecular orbitals vary in size 
and shape as the internuclear separations are changed. Nevertheless, as 
chemists we feel confident that it must be possible to develop the concept of 
interchangeability of molecular orbitals on some sort of corresponding states 
basis. We are impressed with the simple regularities which characterize most 
molecules: the additivity of bond energies; the accurate reproducibility of 
bond lengths, bond angles, and bond force constants, etc. 

* This research was supported by the following grant: National Aeronautics and Space
Administration Grant NsG-275-62.

As our theoretical treatments and mathematical experimentation give
progressively closer analogs to natural phenomena, it is evident that two
types of theories will emerge. First, due to the development of computers
capable of
remembering and manipulating very great detail, there will be the highly 
complex formulations representing numerical models of physical systems which 
can be assimilated only by the high-speed computing machines which generated 
them. And second, there will be the simplified semiempirical formulations 
which the human mind can comprehend and manipulate in a sophisticated 
conceptual or analytical manner. The computing machines will provide specific
answers to specific questions in much the same manner as the results of
laboratory experiments. However, there is a great need for the output of the
computing machines describing these “numeric systems” to be in an accessible
and usable form. These simplified representations will help the scientist to
see the important features of the model and make it easier for him to develop a
simple concept or understanding of the phenomena. The scientist will then
make quantitative definitions of the important features and use the numerical
output of the computing machines to interpolate and extrapolate the changes
of these important features which occur when the “experimental” conditions
are changed.

In seeking a simplified formulation of the behavior of a class of systems, we 
inevitably try to develop a corresponding states treatment in which all of the 
systems obey the same equations when each system is characterized by a set 
of parameters. Thus, the aerodynamical behavior of geometrically similar 
objects moving through different media can be characterized by Reynolds, 
Prandtl, and Schmidt numbers. The volumetric behavior of a gas or liquid 
can be characterized by the critical parameters: P c , V c , and T c . In chemistry, 
a compound is characterized by its atomic composition and by its chemical 
bonds. In quantum chemistry, we further characterize a compound by its 
molecular orbitals. Underlying each of these uses of corresponding states is 
the inherent notion of the interchangeability of systems of a particular class. 
This interchangeability is certainly only an approximation, but it may be useful 
in helping us to comprehend and to predict the behavior of complex systems. 
In the present paper, let us consider on a simple basis, first, how Slater screening 
constants relate to the interchangeability of atomic orbitals and second, their 
role in molecular orbitals. 


II. Atomic Orbitals 

The notion of screening constants stems from a well-known theorem of 
classical electrostatics. Suppose that an atom is composed of a nucleus of 
charge + Ze surrounded by a spherically symmetric cloud of electrons having 
a charge density —ep(r). Then the electrostatic potential V(r) at a distance r 
from the nucleus is 


$$V(r) = +e(Z - S(r))/r = +e Z_{\mathrm{eff}}(r)/r. \qquad (1)$$


Slater Screening Constants in Atomic and Molecular Orbitals 219 


Here the screening constant $S(r) = 4\pi \int_{0}^{r} \rho(r')\, r'^{2}\, dr'$
is the number of electrons lying within a sphere of radius $r$. The effective
nuclear charge is $Z_{\mathrm{eff}}(r) = Z - S(r)$. It is a remarkable fact
that $V(r)$ and $Z_{\mathrm{eff}}(r)$ are unaffected by that part of the
spherically symmetric electronic charge cloud which lies outside of the sphere
of radius $r$. If the electrons were distributed in a set of spherical shells,
then for electrons in the $n$th shell the electrostatic potential would be
$V = e(Z - S_n)/r$, where $S_n$ is the screening constant. If there are $N_k$
electrons in the $k$th shell, then

$$S_{n} = \sum_{k=1}^{n-1} N_{k} + \tfrac{1}{2}(N_{n} - 1). \qquad (2)$$

The reason why each of the other electrons in the $n$th shell is only half
effective in its shielding of a particular electron is that (if the shell has
finite thickness) it is equally probable that an arbitrary other electron has
a radius greater or smaller than the radius of the particular electron under
consideration.
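The shell formula (2) is simple enough to tabulate directly. The following sketch is only an illustration of that formula; the function name and the neon-like shell populations are choices made for the example and are not Slater's fitted screening rules.

```python
def shell_screening_constant(shell_populations, n):
    """S_n of Eq. (2): every electron in the inner shells counts fully and each
    of the other electrons in shell n counts one half.
    shell_populations[k - 1] is N_k, the number of electrons in shell k."""
    if not 1 <= n <= len(shell_populations):
        raise ValueError("n must label one of the given shells")
    inner = sum(shell_populations[: n - 1])
    same_shell = 0.5 * (shell_populations[n - 1] - 1)
    return inner + same_shell

# Neon-like populations (2 electrons, then 8), used purely as an illustration.
for n in (1, 2):
    print(n, shell_screening_constant([2, 8], n))
```

For these populations the idealized shell model gives $S_1 = 0.5$ and $S_2 = 5.5$, so an outer electron would see an effective charge $Z - S_2 = 4.5$ for $Z = 10$.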

In real atoms, the atomic orbitals are not highly localized in a spherical
shell. Thus the concept of a screening constant for each type of orbital
represents an approximation. And, indeed, for the expectation value of each
different type of property, we should use a different value for the screening
constant.

Historically, already in 1921 Schrödinger (1921) suggested that the orbits
of the semiclassical Bohr quantum mechanics be divided into segments. In
each of these segments, the ellipsoidal trajectories were approximated by
assuming a coulomb potential with an effective nuclear charge characteristic
of this segment. Schrödinger’s concept of atomic structure was a crude
beginning of the Hartree atom. In 1927, Pauling (1927; Pauling and Sherman,
1932) used Schrödinger’s procedure to estimate (rather accurately) the molar
refractivity, the diamagnetic susceptibility, and the sizes of various atoms.

In 1930, Slater (1930) made the use of screening constants simple and 
practical. It would have seemed logical to use hydrogenic orbitals (Naqvi, 
1962, 1964; Naqvi and Victor, 1964). However, with hydrogen-like orbitals, 
many of the integrals required for the estimation of atomic properties are quite 
difficult to evaluate. Slater was impressed by Zener’s (1930) work on analytical 
Hartree wave functions. Zener found that, as far as energy is concerned, the 
nodes in the orbitals are quite unimportant. Thus, Slater (1930) simplified the 
Zener wave functions to obtain the familiar Slater orbitals: 

$$R(r) = r^{n^{*}-1} \exp(-(Z - S)r/n^{*}). \qquad (3)$$

Here the screening constant $S$ and the effective principal quantum number
$n^{*}$ are embedded parameters which Slater determined so as to give good
values for the X-ray energy levels of atoms, atomic and ionic radii, etc. Most
of the Slater orbital integrals required for the determination of atomic
properties are simple. For example, the mean value of the $k$th power of the
radius of an electron is given by the relation

$$\langle r^{k} \rangle = \left[\frac{n^{*}}{2(Z - S)}\right]^{k} \frac{\Gamma(2n^{*} + k + 1)}{\Gamma(2n^{*} + 1)}\, a_{0}^{k}. \qquad (4)$$

Much has been written elsewhere on the many different types of applications 
of Slater screening constants (Hirschfelder et al., 1964). For some properties, 
such as the ionization potentials and the atomic radii, they give excellent values. 
For other properties, they give only fairly good approximations. 
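The mean powers of the radius for a Slater orbital can be evaluated in a few lines. The sketch below follows from the radial form (3) by elementary integration of $r^{2} R^{2} r^{k}$; the function name is hypothetical and the identification of individual rows of Table 1 with particular powers of $r$ is an inference from the numbers, not something stated in the table as it survives here.

```python
import math

def slater_mean_power(k, Z, S, n_star):
    """<r^k> in atomic units for R(r) = r**(n_star - 1) * exp(-(Z - S) r / n_star).

    Ratio of int r**(2 n* + k) exp(-2 zeta r) dr to the normalization integral,
    with zeta = (Z - S) / n_star."""
    zeta = (Z - S) / n_star
    return (math.gamma(2 * n_star + k + 1) / math.gamma(2 * n_star + 1)
            / (2 * zeta) ** k)

# Helium 1s orbital with Slater's screening constant S = 0.30 (n* = 1).
for k in (1, 2, 4, 6, -2):
    print(k, round(slater_mean_power(k, Z=2, S=0.30, n_star=1), 3))
```

For helium with $S = 0.30$ the values for $k = 1$ and $k = 2$ are 0.882 and 1.038, which coincide with the first two entries of the Slater-orbital column in Table 1. Changing `S` shows how strongly the high powers of $r$ respond to the choice of screening constant.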

For really high accuracy it is necessary to have different screening constants 
for different properties. For example, consider the expectation value of a 
property which varies as r k . If k is large, those portions of configuration space 
where r is large (and the screening by the other electrons is large) must be 
given the most weight. Thus, properties which vary as r k require large screening 
constants if k is large and small screening constants if k is small. On this 
account, the energy screening constant (corresponding to k = — 1) should be 
less than the diamagnetic susceptibility screening constants (corresponding 
to k = 2). With the use of perturbation theory and hypervirial theorems 
(Sanders and Hirschfelder, 1965; Robinson, 1965) we are now able to calculate 
good values of the screening constants appropriate to a particular property. 
In the perturbation theory calculations, the screening constant is adjusted so 
as to make the first-order correction to the expectation value vanish. Table 1
shows a comparison between the Slater screening constant values, the
perturbation theory values (calculated with hydrogenic orbitals), and the exact
expectation values for a number of one-electron operators.


TABLE 1

Expectation Values of One-Electron Operators for the Helium
Ground State.(a) All Values Are Given in Atomic Units

Operator          Slater orbital     Perturbation theory     Exact
                  with S = 0.30
$r$               0.882              0.923 (S = 0.375)       0.929
$r^{2}$           1.038              1.170 (S = 0.398)       1.192
$r^{4}$           2.694              3.745 (S = 0.434)       3.944
$r^{6}$           13.05              23.63 (S = 0.460)
$r^{-2}$          5.78               5.977 (S = 0.271)       6.017
$\pi\delta(r)$    4.913              5.616 (S = 0.223)       5.688

(a) Sanders and Hirschfelder (1965).

III. Molecular Orbitals

Slater-type orbitals (STO’s) have formed the building blocks for a vast
number of calculations on molecules (Allen and Karo, 1960). These have
ranged from very exact treatments on H$_2$ to semiempirical $\pi$-electron
calculations. The simpler of these approaches uses at most one Slater orbital
on each atom of the molecule to represent each molecular orbital. The more
arduous and sometimes more refined treatments express molecular orbitals as
a linear combination of many STO’s on all centers of the molecular system.
The MO’s obtained by the former treatments may be conveniently called
minimal basis set MO’s and by the latter extended basis set MO’s.

Due to their feasibility and often their qualitative success, minimal basis set
calculations have been and are extensively pursued, particularly for large
systems (more than two centers). It is thus useful, now that extended STO basis
set MO’s very near the Hartree-Fock solutions are available for many diatomic
molecules (Nesbet, 1962; Kahalas and Nesbet, 1963; McLean, 1963; Wahl,
1964; Huo, 1965; Wahl et al., 1966; Cade et al., 1966), to present some
comparisons between the minimal STO basis set MO’s and the extended STO basis
set MO’s.

In Table 2 we have compared total energy, binding energy, ionization 
potentials, and (where relevant) dipole moments. Extensive comparisons of 
this sort, which also involve quadrupole moments and field gradients, are 
given in the papers cited. 

Several points have become clear. One is that the minimal basis set provides 
a poor and unreliable quantitative representation of the molecular orbital 
(see, for example, dipole moment behavior). Second, the Hartree-Fock values 
of one-electron properties are quite good and definite while small basis set 
properties can oscillate widely with change in basis set composition. Third, 
we obviously must go beyond the Hartree-Fock model to describe chemical 
binding in a nonempirical manner (Nesbet, 1965). Perhaps the most compelling 
reason for pursuing the exhaustive and expensive calculations necessary to
obtain HF solutions is that they provide us with good one-electron properties
and a solid and consistent platform from which we can build the improvements
necessary to adequately describe molecules (Das and Wahl, 1966;
Gilbert, 1965). Last, it is encouraging that the molecular Hartree-Fock wave
function seems to be attainable with an extended but manageable basis set
of STO’s, derived from atomic SCF calculations.

Although not sufficient as an accurate description of charge distributions
and not suitable when crudely used for predicting molecular properties, the
Slater screening constants and the single Slater atomic orbitals seem to provide
a rough measure for the size of many molecular orbitals. There are, of course,
other molecular orbitals which are so distorted that they bear little
relation to the atomic orbitals.

A following chapter by Wahl in this book presents the electron density
of very nearly the Hartree-Fock molecular orbitals of the homonuclear
diatomic molecules in the first row of the periodic table. As a step in
parameterization of these orbitals (since we now have them we should try to put
them in a simpler form), it is interesting to see how accurately we can estimate
some general features of these molecular Hartree-Fock orbitals with a very
unsophisticated use of Slater screening constants. First of all, we can consider
the distance from the nucleus at which a 2s or 2p atomic orbital has a
maximum charge density. According to Slater orbitals this should occur at
$r_{max} = (Z - S)^{-1} a_0$. If the molecular orbitals were the same as the
atomic orbitals in the separated atoms, and if neglect of overlap is
justifiable, then the values of $r_{max}$ measured toward the outside of the
molecule for the $2\sigma_g$, $2\sigma_u$, $1\pi_u$, $3\sigma_g$, and $1\pi_g$
Hartree-Fock orbitals should all be comparable to the
$(Z - S)^{-1} a_0$ for the separated atoms. Table 3 shows this comparison.
Whereas our simple Slater screening constant prediction of the $r_{max}$ is
excellent for the $1\pi_u$, $3\sigma_g$, and $1\pi_g$, it is very poor for the
$2\sigma_g$ and the $2\sigma_u$.

TABLE 3

Comparison of $r_{max}$ for Hartree-Fock Molecular Orbitals with $r_{max}$ for
Slater Atomic Orbitals(a)

$r_{max}$ ($a_0$)

          Minimal STO atom     Extended basis MO’s
          $(Z-S)^{-1}$         $2\sigma_g$   $2\sigma_u$   $1\pi_u$
Li$_2$    1.54                 2.3
B$_2$     0.77                 1.7           0.90          0.80
C$_2$     0.65                 1.0           0.75          0.65
N$_2$     0.51                 0.80          0.65          0.50          0.50
O$_2$     0.44                 0.75          0.60          0.45          0.45
F$_2$     0.38                 0.70          0.57          0.35          0.35

(a) Here $r_{max}$ is the distance from the nucleus to the maximum electron
density in the outer loop of the orbital. The Hartree-Fock molecular orbitals
are given in the preceding paper.

The next question which we can ask is how well the Slater screening
constants with Slater orbitals predict the position of the outermost electron
density contour as shown in the preceding paper. This contour corresponds to
an electron density in the orbital of $6.1 \times 10^{-5}\, e a_0^{-3}$. Table 3
shows this comparison. The agreement is excellent for the $1\sigma_g$ and
$1\sigma_u$ orbitals which, except for H$_2$, really “look like” atomic cores.

For the other orbitals the outer perimeter predicted by nonoverlapping
single STO’s with atomic screening constants is quite poor. The $2\sigma_g$
Hartree-Fock MO perimeter shows a “pulling in” along the molecular axis
relative to the simple Slater atomic orbital while the $2\sigma_u$ Hartree-Fock
perimeter has moved out relative to the Slater atomic orbital. The $1\pi_u$,
$3\sigma_g$, and $1\pi_g$ orbitals all have a much larger perimeter than the
corresponding Slater atomic orbital. However, the charge along this perimeter
is extremely small, and it is only as they reflect more significant shifts in
charge that these observations are important. The differences displayed in
Tables 3 and 4 arise from: (1) neglect of overlap, (2) the HF MO’s form an
orthonormal set while the Slater AO’s are only normalized, (3) the
inadequacies of a single STO in representing any orbital, atomic or molecular,
(4) the Slater AO has been “frozen” in the molecule and not allowed to distort
through the variational procedure. Since these are just the four consequences
of the usual assumptions made in the most simple use of single STO’s in
molecules, these comparisons may be instructive.

[Full-page table printed sideways on p. 224 of the original; its entries are not recoverable from this copy.]


IV. Conclusion 

Certainly these are crude observations and a detailed analysis of these MO’s 
is needed, but we feel that some Slater-type parameterization of these accurate 
MO’s can provide us with molecular building blocks and molecular screening 
parameters useful for proceeding to larger systems and estimating molecular 
properties just as Slater screening constants have enabled us to think about 
and represent atoms adequately for many purposes.

I. Introduction

The development during the last decade of large digital computers has 
stimulated the quest for accurate Hartree-Fock self-consistent-field wave 
functions for small polyatomic molecules. The solution of the Hartree-Fock 
equations by expanding the molecular orbitals as a linear combination of 
atomic orbitals has been elegantly formulated by Roothaan (1951). The 
problem is thus reduced to that of choosing an adequate set of basis functions. 
This is complicated by the need to evaluate the many multicenter molecular 
integrals that arise in the course of the computation. 

The use of Gaussian-type functions was proposed about fifteen years ago 
by Boys (1950) and some early molecular calculations were performed by 
Meckler (1953) and Nesbet (1960). The disadvantage of the Gaussian functions 
in contrast to the Slater orbitals lies in the slow convergence of the expansion 
due to their poor behavior both at the nucleus and in the tail. The main 
advantage is that formulae for all multicenter integrals are available and 
readily adapted to high-speed computation. The essence of the Gaussian 
approximation is thus the transformation of the integral evaluation problem 


* This research was supported in part by the Air Force Cambridge Research Laboratories, 
to one of data processing. In order to surmount this difficulty a set of programs 
was developed at M.I.T. principally by Harrison (1963) in collaboration with 
Sutcliffe, Csizmadia, and Moskowitz to perform nonempirical calculations 
for small molecules in a Gaussian basis. 
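
The two weaknesses of Gaussian functions noted above, the missing cusp at the nucleus and the too-rapid decay in the tail, can be seen numerically in a minimal sketch. The exponents used here (zeta = 1.0 for the Slater function, alpha = 0.27095 for a single Gaussian) are illustrative assumptions and are not taken from the calculations reported in this paper.

```python
import math

def slater_1s(r, zeta=1.0):
    """Normalized 1s Slater-type function with radial form exp(-zeta*r)."""
    return math.sqrt(zeta**3 / math.pi) * math.exp(-zeta * r)

def gaussian_1s(r, alpha=0.27095):
    """Normalized 1s Gaussian-type function with radial form exp(-alpha*r^2)."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r * r)

def log_slope(f, r, h=1e-5):
    """Forward-difference estimate of d(ln f)/dr at r."""
    return (math.log(f(r + h)) - math.log(f(r))) / h

# Slope of ln(f) at the nucleus: about -zeta for the Slater function (a cusp),
# about zero for the Gaussian (no cusp). Exponents are assumed values.
cusp_slater = log_slope(slater_1s, 0.0)
cusp_gauss = log_slope(gaussian_1s, 0.0)

# Far from the nucleus the Gaussian is smaller than the Slater function
# by several orders of magnitude.
tail_ratio = gaussian_1s(8.0) / slater_1s(8.0)
```

Because a single Gaussian fails at both ends, several Gaussians of different exponents are needed to mimic one Slater orbital, which is the slow convergence referred to in the text.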

The structure of the programs differs somewhat from the one usually 
adopted. First, the programs were designed to be as open ended as possible 
and so were written as a set of independent Fortran chain links. Second, 
generality was preferred to efficiency. That is, where there was a choice 
between a general-purpose program and a special-purpose one, the former 
was chosen. Third, simplicity was preferred to efficiency, so that all except 
the most basic programs were written in Fortran and these were heavily 
subroutinized to preserve readability. These principles were adopted not 
because of indifference to the economics of such calculations but because the 
authors felt that much work in this field has been duplicated because of unduly 
specialized programs and excessive pride in writing programs which no 
one else can understand. 

Previous papers in this series have reported Hartree-Fock wave functions 
for numerous molecular systems including HF (Harrison, 1964), ethylene 
(Moskowitz and Harrison, 1965a), water (Moskowitz and Harrison, 1965b), 
formyl fluoride (Csizmadia et al., 1966), acetylene (Moskowitz, 1965), and 
benzene (Schulman and Moskowitz, 1965), to name a few. Extensive calculations 
in a Gaussian basis have also recently been performed by Allen and 
Whitten (1965), Burnelle (1965), Huzinaga (1965), Krauss (1963, 1965), 
and Reeves (1963). 

Since our original work on HF, Roothaan and Cade (1965) at Chicago 
have computed extremely accurate self-consistent-field functions for diatomic 
molecules in a Slater basis. They have also suggested the desirability of 
approaching the self-consistent-field limit in order to assure convergence of 
one-electron properties computed from the molecular wave function. Under 
the stimulus of the Chicago effort we have reexamined the HF molecule in 
order to answer the following questions: (1) How accurate a result can one 
obtain in a Gaussian basis? (2) What are the effects of polarization functions, 
i.e., d-orbitals on fluorine and p-orbitals on hydrogen? (3) How well 
does the Gaussian basis handle one-electron molecular properties? 

The calculated values of total energy, dipole moment (μ), field gradient at 
the nucleus (q), the molecular quadrupole moment (Q), and ⟨r²⟩ for several 
Slater and Gaussian bases are presented in Table 1. The notation (95/32) stands 
for nine s-like and five p-like Gaussians on fluorine and three s-like and two 
p-like on hydrogen. In the same manner (952/32) indicates the addition of two 
d-like functions for both sigma and pi symmetry on the fluorine, etc. The 
exponents used included those obtained by Huzinaga (1965) for the fluorine 
atom, and by Reeves (1963) for the hydrogen atom. The exponents for the 
TABLE 1 

Energy, Dipole Moment (μ), Field Gradient at the Nucleus (q), Molecular 
Quadrupole Moment (Q), and ⟨r²⟩ of HF (R = 1.7328 au) for Several Slater 
and Gaussian Basesᵃ 

Basis          Energy       μ        q_zz     q_zz     Q_zz     ⟨r²⟩ 

Clementiᵇ    -100.0580    0.78     0.597    3.501     —        — 
Nesbetᵇ      -100.0571    0.79     0.582    2.778     —        — 
(95/32)      -100.0319    0.923    0.586    3.131    1.876    13.524 
(952/3)      -100.0411    0.757    0.579    2.990    1.945    13.539 
(952/32)     -100.0489    0.751    0.570    2.896    1.853    13.566 
(1062/42)    -100.0622    0.756    0.565    2.897    1.888    13.718 
Chicago      -100.0703    0.762    0.540    2.869    1.884     — 
Exptl        -100.1325    0.68     0.513     —        —        — 

ᵃ All properties in atomic units. 

ᵇ Recomputed by P. E. Cade, private communication. 


polarization functions were estimated from previous work with Slater 
functions.* 


II. Results 

The results show that the energy of the largest Gaussian basis (1062/42) 
is lower than the Slater calculations of Nesbet (1962) and Clementi (1962), 
and only about 0.01 atomic unit above the Hartree-Fock limit as 
estimated from the Chicago computation. It should be remembered that the 
Slater basis results represent an extensive optimization of the nonlinear 
parameters. In the Gaussian case no reoptimization of the orbital exponents 
taken from atomic calculations has been attempted. Part of the discrepancy 
in the total energy is undoubtedly due to this constraint on the wave function. 
The remainder of the error probably arises both from the inherently poor 
representation of the wave function in the vicinity of the nuclei and the lack 
of f-orbitals as polarization functions. 

The one-electron properties in the Gaussian basis have evidently converged 
to the Hartree-Fock limit. Further, changing from the (952/32) 
basis to the (1062/42) basis has a negligible effect on the one-electron properties, 
though it lowers the energy noticeably. The results also suggest that to a 
first approximation the d-functions on the fluorine have a greater effect than 
the hydrogen p-orbitals on one-electron properties. This is quite encouraging 
for future work on larger polyatomic systems. 
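
The energy comparison made above can be reproduced directly from the entries of Table 1. The short sketch below tabulates how far each basis lies above the Chicago energy, which is taken here (as in the text) as the estimate of the Hartree-Fock limit; the variable names are illustrative only.

```python
# Total energies in atomic units, copied from Table 1.
energies = {
    "Clementi": -100.0580,
    "Nesbet": -100.0571,
    "(95/32)": -100.0319,
    "(952/3)": -100.0411,
    "(952/32)": -100.0489,
    "(1062/42)": -100.0622,
}
chicago = -100.0703  # used as the Hartree-Fock limit estimate

# Gap of each basis above the Chicago energy, rounded to the table precision.
gaps = {basis: round(e - chicago, 4) for basis, e in energies.items()}

for basis, gap in gaps.items():
    print(f"{basis:>10s}  {gap:.4f} au above the Chicago energy")
```

The largest Gaussian basis comes out 0.0081 au above the Chicago value, consistent with the figure of about 0.01 atomic unit quoted in the text, and below both Slater-basis results.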

* The Gaussian exponents on fluorine were: d_σ, 2.5, 0.5; d_π, 1.5, 0.3. The exponents on 
hydrogen were: both p_σ and p_π, 1.5, 0.3. 
In summary we conclude that the Gaussian basis is acceptable for the 
representation of accurate Hartree-Fock wave functions. Indeed, further 
experience should lead to the choice of basis sets which, although giving 
energies inferior to the Hartree-Fock values, will give adequate values of the 
important one-electron properties. 
I. Introduction 

In attempting to understand the energetics and geometry of molecular 
formation, it is usual to deal with orbital approximations to molecular wave 
functions. The most prominent orbital methods, because of their conceptual 
simplicity, are the approximations based on the assignment of electrons either 
to molecular orbitals (MO’s), or to atomic orbitals (AO’s) of the atoms 
which have been combined to form the molecule. The AO method includes the 
Heitler-London and valence bond methods for dealing with interatomically 
paired electrons, but also includes ion-pair bonding and other situations using 
AO’s. The MO method can be presented in various forms; the most typical 
is that which uses completely delocalized MO’s conforming to representations 
of the symmetry group of the whole molecule. These MO’s can be called 
spectroscopic MO’s, because it is they which are most directly needed in 
dealing with spectroscopic excitation and with ionization. On the other hand, 
localized MO’s, in particular bond MO’s, are very useful in understanding 
various characteristics of chemical bonds; they, together with lone pair AO’s, 
can be called chemical orbitals. 

* This work was assisted by the Office of Naval Research, Physics Branch under Contract 
Nonr-2121(01), with the University of Chicago, and by a contract between the Division of 
Biology and Medicine, U.S. Atomic Energy Commission and the Florida State University. 
Although orbitals have no substantial existence, and one can argue that 
they are merely mathematical conveniences, they have been and still are 
useful as conceptual units whose characteristics are worth examining. The 
present paper is limited mainly to a survey of the bonding properties of 
spectroscopic molecular orbitals in diatomic molecules. 

The term “bond strength” is commonly used very loosely. Usually it 
seems to mean “bond energy”, i.e., energy of formation of a bond (Pauling, 
1960). However, Mulliken (1931, 1932) has distinguished the “energy bonding 
power” and the “distance bonding power” of molecular orbitals. The total 
bond energy, or the equilibrium value of the internuclear distance R, 
respectively, are then regarded as determined by a summation of effects from 
electrons in individual MO’s. For diatomic valence-shell MO’s, as first 
emphasized by Lennard-Jones (1929), expressions of LCAO form generally 
furnish good approximations. For diatomic hydrides, however, UAO 
(united-atom AO) forms are perhaps usually better than LCAO forms even 
for valence-shell MO’s. 


II. Empirical Criteria for MO Bonding Powers 

The usual criteria (Mulliken, 1928, as modified by inclusion of the category 
of antibonding electrons: Herzberg, 1929, 1931; also Mulliken, 1931, 1932) 
for a bonding, antibonding, or nonbonding MO for a neutral molecule in its 
normal state N can be stated in terms of what happens (1) to the dissociation 
energy D or (2) to R_e and/or the related quantity ω_e (or to the corresponding 
force constant k), when an electron is removed from the MO under consideration. 
If (1) the change ΔD is negative or if (2) ΔR_e is positive and/or Δω_e 
is negative when an electron is removed from it, the MO is adjudged bonding; 
if the opposite changes occur, it is considered antibonding; if little or no 
change occurs, it is nonbonding. Simple examples are the 1σ_g MO of H₂ 
(strongly bonding by either criterion), and the 1σ_u MO of He₂ (strongly 
antibonding by either criterion: although He₂ is not stable, removal of one 
1σ_u electron gives stable He₂⁺). Criteria (1) and (2) can be called the 
thermochemical and the equilibrium MO bonding criteria respectively. 

Additive and subtractive valence-shell LCAO MO’s (χ_a ± χ_b for homopolar, 
or αχ_a + βχ_b and βχ_a − αχ_b, with α > β, for heteropolar molecules), which 
have ξ (the “reduced internuclear distance”) near 1 at R_e, are in general 
expected from simple LCAO theory to be respectively bonding or antibonding 
by both the thermochemical and the equilibrium criterion. [The quantity ξ is 
defined as the ratio of the internuclear distance R to twice the radius of 
maximum radial density for the relevant AO, which may be the AO used in the 
LCAO expression, or, for small ξ, may be a UAO (Mulliken, 1932, p. 40).] 
Actually, however, factors to be discussed below sometimes play havoc with 
the applicability of the thermochemical criterion even for state N. The main 
such factor is the noncrossing rule, according to which the U(R) curve of any 
molecular state as R → ∞ may have to make a short cut to the lowest available 
atom-pair state of the proper group-theoretical species, even if (as happens 
not infrequently) this does not correspond to the valence structure of the 
molecular state. For excited states, even those of valence-shell type, such 
dissociation short-cuts are very prevalent; some examples will be discussed 
in a later section. Inner-shell MO’s, which are of LCAO type with ξ ≫ 1 when 
R is near R_e, are expected to be essentially nonbonding by both criteria. For 
Rydberg MO’s, which at R_e are of UAO type with ξ < 1, thermochemical 
criteria are irrelevant, but by the equilibrium criterion they are nonbonding 
or nearly so. 

Instead of ΔD, a very direct criterion for bond energy is simply the magnitude 
of D itself, but in general this is a measure of the total effect of all the 
electrons and cannot easily be used as a criterion for individual MO’s. However, 
this D criterion can be used in simple cases. In state N of H₂⁺, it coincides 
with the ΔD criterion, since D for H₂⁺⁺ is zero. In state N of H₂ with two 
1σ_g electrons, D is nearly twice as large at R_e as for H₂⁺ with one 1σ_g electron 
at its R_e, as seems reasonable. The D criterion makes sense also for the 1σ_u 
state of H₂⁺ (D = 0), the 1σ_g²1σ_u² state of He₂ (D = 0), and the 1σ_g²1σ_u (D > 0) 
and 1σ_g1σ_u² (D = 0) states of He₂⁺, if 1σ_u is antibonding and more strongly so 
than 1σ_g is bonding (a relation which is rather well understood). 

The larger D for state N of M₂⁺ than for that of M₂, when M is any alkali 
metal, is less easy to explain (Barrow et al., 1960; Robertson and Barrow, 
1961; Lee and Mahan, 1965). Here the ΔD criterion (ΔD < 0) contradicts the 
evidence of the ΔR_e and Δω_e criteria (ΔR_e > 0, Δω_e < 0), and of D itself, 
which all support the expectation from its additive form that the valence-shell 
MO is bonding. However, this case is very exceptional. 

For state N of first-row molecules built from atoms with 2p electrons, as 
Herzberg (1929, 1931) first pointed out, D runs parallel to the excess of the 
number of bonding over that of antibonding electrons if one attributes D 
just to those MO’s which can be constructed as LCAO’s from the 2p electrons, 
and the net effect of 2s² closed shells on bonding is considered nil. However, 
as SCF MO calculations have shown especially clearly in recent years, matters 
are really less simple in that the 2s as well as the 2p atomic shells actually are 
involved to an important extent in bonding and antibonding, and even the 
1s shells make an appreciable (antibonding) contribution. 

While the empirical D is a good practical measure of bond energy, in general, 
a theoretically more significant D, the intrinsic D, can be obtained if the 
dissociation energy is measured from an asymptote in which the atoms are in 
suitable valence states (see Hinze and Jaffe, 1962 and references given by 
them).*† For example in N₂, the two N atoms should be in their V₃ (trivalent) 
states, which for each atom are about 1.1 eV above the 1s²2s²2p³, ⁴S normal 
state; thus the intrinsic D is 12.0 eV, 2.2 eV larger than the ordinary or net 
D of 9.76 eV. In the N₂ example, promotion to the s²p³ trivalent valence 
state is intraconfigurational. In other cases, for example CH₄ and most other 
carbon compounds, pluvalent configurational promotion (from s²p², V₂ to 
sp³, V₄) is essential to obtain an intrinsic D. 
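
The arithmetic behind the intrinsic D quoted for nitrogen can be written out as a small sketch. The function name is hypothetical; the numbers are the ones given in the text (net D of 9.76 eV and a valence-state promotion energy of about 1.1 eV per atom).

```python
def intrinsic_d(net_d_ev, promotion_energies_ev):
    """Intrinsic D: net D plus the valence-state promotion energy of each atom."""
    return net_d_ev + sum(promotion_energies_ev)

# Nitrogen molecule: both atoms promoted by about 1.1 eV (values from the text).
d_n2 = intrinsic_d(9.76, [1.1, 1.1])
print(f"intrinsic D = {d_n2:.2f} eV")
```

The sum is 11.96 eV, which rounds to the 12.0 eV stated above.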

Another complication is that of hybridization during bond formation; this 
could be deemed to call for the use of a valence state with partial configurational 
promotion (partial because the hybridization is only partial), which 
would increase the intrinsic D further. For example in N₂, there is very 
appreciable isovalent hybridization which would call for partial (about 25%) 
isovalent configurational promotion to a 1s²2s2p⁴, V₃ valence state (Fraga 
and Mulliken, 1960, and references given by them). However, the construction 
of the energy of a partially promoted valence state to take care of hybridization 
is generally fraught with uncertainty, and in view also of the fact that s, p 
“hybridization” is only one of several forms of modification which atomic 
valence shell AO’s have to undergo during bond formation, it is in the writer’s 
present opinion just as well to omit any allowance for hybridization in obtaining 
an intrinsic D. [Whether viewed in the context of AO theory or of MO 
theory, the orbitals in the actual molecule really involve MAO’s (modified 
AO’s) instead of free-atom AO’s: MAO’s directly in AO theory, LCMAO’s 
in MO theory. These modifications include scaling, symmetrical distortion, 
and polarization (including hybridization in the usual sense, but also the mixing 
in of higher-shell, higher-l orbital forms). It can be shown that all of these are 
equivalent to using various configurationally promoted states, some with higher 
n only, others with higher l, but all of course with equal λ (Mulliken, 1962, 1965).] 
Thus in the present discussion the only promotion which will be admitted 
in assessing intrinsic D’s, other than any necessary pluvalent promotion to 
pluvalent valence states, is intraconfigurational promotion to valence states. 

The preceding discussion regarding intrinsic D’s must be qualified in one 
respect. Namely, at large R values near dissociation, every molecular state 
wave function approximates to a single LCAPAS based on state N AS’s of 
the two atoms. [Here “LCAPAS” refers to a function which is (in general) 
a linear combination (LC) of over-all antisymmetrized products 𝒜ψ_aψ_b (AP’s) 

* Different authors give slightly different values for atomic valence state energies, because 
of intrinsic difficulties (due to configuration interaction) in defining them exactly. 

† The term valence “state” is somewhat misleading, as Longuet-Higgins has pointed 
out in conversation, since a valence state wave function cannot be written as a linear 
combination of individual state functions. (The difficulty is that no definite signs, or phase 
relations at all, can be assigned to the terms in such linear combinations.) It is only the 
valence state energy which is a definite linear combination of individual state energies. 



of atomic substate (AS) strong-field (i.e., characterized by definite M_L and 
M_S values) wave functions ψ_a and ψ_b of the two atoms (Mulliken, 1966). This 
large-R LCAPAS could very properly be called an “atoms-in-molecule” 
function. Although the atoms-in-molecule method was developed intensively 
by Moffitt (1951; see also Ellison, 1965, and references given there), atomic-state 
bonding, in contrast to electron-pair Heitler-London bonding, was 
earlier extensively discussed by Nordheim-Poschl (1936).] Only at rather small 
R values does mixing with other LCAPAS’s, both of the same two-atom AO 
configuration and of promoted (including ionic) configurations, come importantly 
into play, under the action of valence forces. However, these other 
LCAPAS’s never mix in, even at R_e, in quite as large proportions as would 
correspond to an ideal valence state. In other words, lower-energy AO state-pairs 
always have larger, and higher-energy ones smaller, weights as compared 
with ideal valence-state proportions. For example in the normal state of N₂, 
the (⁴S, ⁴S), ¹Σ_g⁺ LCAPAS must contribute somewhat disproportionately 
relative to the several LCAPAS’s from the pairs (²D, ²D), (²D, ²P), and 
(²P, ²P) of the same s²p³, s²p³ configuration. For similar reasons the C atom, 
whose normal state configuration is capable only of bivalence, cannot be quite 
fully quadrivalent in typical compounds like CH₄, CO₂, C₂H₂, and so on 
(Van Vleck and Sherman, 1935; Voge, 1936; Kotani and Siga, 1937). 

All in all, it appears that even intrinsic D values (and certainly not ordinary 
net D values when they differ much from intrinsic D values) have in general 
no very simple theoretical meaning. In terms of SCF-MO theory they usually 
do not, because the SCF-MO wave function in the absence of configuration 
mixing ceases for most neutral molecules to be a good approximation as R 
increases during dissociation. In terms of valence-bond theory, defined as that 
part of AO theory which for neutral molecules connects molecular states with 
atom-pair valence states, intrinsic D values do have some meaning. This 
significance is, however, usually only roughly approximate because promotion 
of the atoms is not quite in the ideal proportions required for true valence 
states and, especially, a rather large amount of additional promotion not 
contemplated in valence bond theory, to excited atom-pair and to ion-pair 
states, is present (see the discussion above about the necessity of MAO’s). The 
foregoing treatment, although stated primarily in terms of neutral homopolar 
molecules, can also be extended to heteropolar and to charged molecules. 

Having concluded that neither ordinary net nor intrinsic D values are in 
general useful criteria for the bonding characteristics of individual MO’s, we 
next consider the ΔD criterion. First we note that for homopolar molecules 

\[
\begin{aligned}
\Delta D = D^{+} - D &= [U^{+}(\infty) - U^{+}(R_e^{+})] - [U(\infty) - U(R_e)] \\
&= [U^{+}(\infty) - U(\infty)] - [U^{+}(R_e^{+}) - U(R_e)] \\
&= I_{\infty} - I_{e} = I_{\text{atom}} - I_{\text{molecule}} \equiv {}^{e}\Delta^{\infty} I
\end{aligned}
\tag{1}
\]

where D⁺, U⁺, R_e⁺ refer to the positive ion and D, U, R_e to the neutral molecule. 
It is seen that ΔD is exactly equal to the difference between I of one of 
the atoms into which the molecule dissociates and I of the molecule. 
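
The cycle expressed by Eq. (1) can be checked numerically for the hydrogen molecule. The sketch below is only an illustration: the four energies are approximate literature values assumed for the purpose (they are not data from this paper), and they refer to the lowest vibrational levels rather than to the potential minima, so zero-point differences are neglected.

```python
def delta_d_from_ionization(i_atom_ev, i_molecule_ev):
    """Delta D = D(ion) - D(neutral) = I(atom) - I(molecule), as in Eq. (1)."""
    return i_atom_ev - i_molecule_ev

# Approximate literature values in eV, assumed for illustration only.
I_H = 13.598      # ionization energy of the H atom
I_H2 = 15.426     # adiabatic ionization energy of H2
D_H2 = 4.478      # dissociation energy of H2
D_H2_ION = 2.651  # dissociation energy of H2+

delta_d_cycle = delta_d_from_ionization(I_H, I_H2)  # from ionization energies
delta_d_direct = D_H2_ION - D_H2                    # from dissociation energies

print(f"from I values: {delta_d_cycle:.3f} eV")
print(f"from D values: {delta_d_direct:.3f} eV")
```

Both routes give about -1.83 eV, so with the definition of Eq. (1) removing one electron from the additive MO of the hydrogen molecule lowers D.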

For MO’s of LCMAO form (αχ_a + βχ_b, or χ_a ± χ_b for homopolar molecules), 
it was noted empirically some time ago that when valence-state I’s are used, 
the quantity ᵉΔ^∞ I (and so ΔD) is positive for valence-shell MO’s of additive 
LCAO form and negative for those of subtractive LCAO form, with magnitudes 
in the neighborhood of 2 or 3 eV if suitable valence-state I values 
are used for the AO’s (Mulliken, 1934, 1935). Here the positive or 
negative sign respectively of ᵉΔ^∞ I (or equally of ΔD) is found to be indicative 
of bonding or antibonding. For heteropolar molecules, I_∞ in Eq. (1) is a 
(perhaps weighted) mean of the appropriate I’s of the two atoms. 

However, it should be noted that the ᵉΔ^∞ I criterion has no real theoretical 
basis, but involves a comparison which corresponds to an MO pseudo-correlation. 
That is, in Eq. (1), although I_e corresponds to ionization of the 
molecule in the region of R values where the MO approximation is good, and 
represents a good approximation for an MO term value, I_∞ = I_atom is a 
free-atom value corresponding to the use of the AO approximation, which is now 
accurate whereas an SCF-MO function at R = ∞, which would be required 
(and can in fact easily be constructed) for a true MO correlation, would have 
much too high an energy. [See Mulliken (1966) for a discussion of MO correlation 
diagrams, in which it is pointed out that for those valence-shell 
electrons which are usually considered to be bonding electrons the usual MO 
correlations are only pseudo-correlations as R → ∞.] Nevertheless ᵉΔ^∞ I 
proves empirically to be a rather satisfactory thermochemical criterion in the 
case of valence-shell MO’s. At the same time, it appears from the preceding 
discussion that there are no general theoretically well-based thermochemical 
criteria for the bonding powers of individual MO’s. 

There now remains to be considered the equilibrium (ΔR_e or Δω_e) criterion 
of MO bonding power. As is well known, there exists for different states of 
any one diatomic molecule the qualitatively almost invariant empirical relation 
that a state with larger R_e has smaller ω_e* (see Herzberg, 1950). Hence the 
ΔR_e and Δω_e criteria are very nearly equivalent. The ΔR_e and Δω_e criteria 
as applied to molecular states where both the molecule and its ion are stable 
seem to be very satisfactory criteria of MO bonding characteristics in actual 
molecules. Because the operation of these criteria is confined to moderate R 
values near an R_e, or else one of the states involved is unstable (D = 0) as in 
the examples of He₂ and H₂ cited at the beginning of this paper, usually no 


* The exceptions are rare and quantitatively trivial. They occur only for pairs of states 
of nearly equal R_e; in such cases the ω_e’s are also always nearly equal, but occasionally a 
slightly larger ω_e may accompany a slightly larger R_e contrary to the usual rule. 
difficulty about failure of the SCF-MO approximation to be a good approximation, 
such as normally occurs with D and ΔD criteria, arises here. And 
unlike the latter criteria which fail most conspicuously for some of the Rydberg 
states of molecules, the ΔR_e and Δω_e criteria indicate, as would be expected 
theoretically, that Rydberg MO’s are nearly nonbonding. 

Comparisons among the R_e and ω_e data on the CH and CH⁺ states listed in 
the following Table 1 are instructive. The normal-state MO configuration of 


TABLE 1 

Low-Energy States of CH and CH⁺ 

               Excitation      Molecular constants        Dissociation 
State          energy (eV)     R_e (Å)    ω_e (cm⁻¹)      Energy D (eV)    Productsᵇ 

σπ, ¹Π         13.56           1.234      1865            0.7              ²P 
σ², ¹Σ⁺        10.64           1.131      (2867)          3.6              ²P 
σπ², ²Σ⁺       3.94            1.113      2824            0.79             ¹D 
σπ², ²Σ⁻       3.19            1.186      2543            0.28             ³P 
σπ², ²Δ        2.87            1.102      2921            1.86             ¹D 
σπ², ⁴Σ⁻       [0.6]ᵃ          [1.09]ᵃ    [2970]ᵃ         [2.9]ᵃ           ³P 
σ²π, ²Π        0               1.120      2862            3.47             ³P 

ᵃ Estimated. 

ᵇ One product is 1s H, the other is the carbon ion or atom state listed, of configuration 
1s²2s²2p or 1s²2s²2p². 


CH is 1σ²2σ²3σ²1π, the state being ²Π, while for the first excited MO 
configuration 1σ²2σ²3σ1π² there are several states: ⁴Σ⁻, ²Δ, ²Σ⁻, and 
²Σ⁺. All of the σπ² states except the ⁴Σ⁻ are well known experimentally. 
The ΔR_e and Δω_e comparisons of the CH and CH⁺ normal states indicate 
that the 3σ MO is nearly nonbonding. A comparison of R_e and ω_e values 
between the normal and excited states of CH then indicates that the 1π MO 
(as would be expected from its form) is also nearly nonbonding. Such ΔR_e, 
Δω_e intercomparisons between different states of a neutral molecule are 
valuable in showing the relative bonding powers of different MO’s. 
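
The equilibrium criterion can be put in the form of a small sketch and applied to the normal states of CH and CH⁺ from Table 1. The function and its tolerance parameters are hypothetical: the text says only “little or no change” and gives no numerical thresholds, and its “and/or” is narrowed here to require both changes to agree.

```python
def classify_mo(delta_re, delta_we, re_tol=0.02, we_tol=100.0):
    """Classify an MO from the changes produced by removing one electron from it.

    delta_re: R_e(ion) - R_e(neutral), in angstroms
    delta_we: omega_e(ion) - omega_e(neutral), in cm^-1
    re_tol, we_tol: assumed thresholds for "little or no change"
    """
    if abs(delta_re) <= re_tol and abs(delta_we) <= we_tol:
        return "nonbonding"
    if delta_re > 0 and delta_we < 0:
        return "bonding"
    if delta_re < 0 and delta_we > 0:
        return "antibonding"
    return "ambiguous"

# Normal states from Table 1: CH (R_e = 1.120, omega_e = 2862) and
# CH+ (R_e = 1.131, omega_e = 2867); the electron removed is the 3-sigma one.
label_3sigma = classify_mo(1.131 - 1.120, 2867 - 2862)
print(label_3sigma)
```

With these tolerances the 3σ MO comes out as nonbonding, in line with the conclusion drawn above; different tolerances would of course shift the borderline cases.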

[The CH molecule supplies also one of the rare examples where the ΔR_e and 
Δω_e rules fail. Namely, in a comparison of the R_e and ω_e data for the excited 
state of CH⁺ with those for the normal state of CH (cf. Table 1), ΔR_e and 
Δω_e would now indicate that 3σ is a bonding MO. Or comparing the ¹Π 
data with the R_e, ω_e data for the excited states of CH, ΔR_e and Δω_e would 
indicate that 1π also is a bonding MO. These conclusions contradict the more 
reasonable conclusions (since 1π surely must be nearly nonbonding) reached 
above. They serve as a warning that it is an oversimplification to assume that 
the ΔR_e, Δω_e rule is infallible. Reasons for the contradiction may perhaps be 
sought in a less nearly united-atom character for the ¹Π state than for the 
other states; a detailed analysis would be rather lengthy.] 

The fact that ΔR_e and Δω_e are both small when a nonbonding electron is 
removed can be expressed in a concise and slightly generalized form by saying 
that the potential curve near R_e for an electronic state containing a nearly 
nonbonding electron runs almost parallel to that of the state of the positive 
ion which results when this electron is removed. However, these curves may 
be expected to diverge increasingly at R values increasingly far from R_e. 
Rydberg states are typical examples of states with a nonbonding MO, here 
the Rydberg MO, and it is characteristic for them that the shapes of their 
potential curves near R_e are nearly the same as those of the corresponding 
positive ion. Beyond R_e, out to R = ∞, a rough parallelism of potential curves 
in some cases continues, while in others a sharp divergence occurs (Mulliken, 
1966). 


III. Effects of Dissociation Products on R_e and ω_e 

Returning to a consideration of Table 1, it is of interest to note that the 
potential curves of different states of the σπ² configuration of CH must differ 
strongly at large R values because of differences in the relations of their 
energies at R_e to those of their allowable dissociation products. It is notable 
that in spite of the corresponding large differences in their D values (see 
Table 1), all these states have nearly the same R_e and ω_e. However, R_e is 
somewhat larger and ω_e smaller depending on how small the dissociation 
energy D is. This effect can reasonably be ascribed to incipient small differences 
in (kind and) amount of minor CM (configuration mixing) near R_e (more 
CM when D is smaller) related to differences in LCAPAS correlation as 
R → ∞. 

Table 2 for N₂ shows indications of similar effects. In particular, the ³Σ_u⁺ 
state of the π_u³σ_g²π_g configuration, whose D is considerably smaller than for 
the other states, shows definitely smaller ω_e and a little larger R_e than for the 
others. However, the ¹Δ_u state, in spite of a somewhat smaller D than for the 
¹Σ_u⁻ state, shows a slightly larger ω_e and smaller R_e than the latter. Again, 
the ¹Π_g state of configuration π_u⁴σ_gπ_g, in spite of a distinctly larger D, shows 
a slightly smaller ω_e and larger R_e than the ³Π_g state of the same configuration. 
But in the N₂ states cited the relative values of D differ far less than in 
Table 1 for CH. The tentative conclusion seems justified that ω_e and R_e 
values tend to be nearly the same for different states of any one configuration, 
but ω_e tends to be smaller and R_e larger for states with exceptionally small D. 
However, this is not quite the whole story. The so-called V states of 
diatomic molecules have exceptionally large R_e and small ω_e coupled with 
fairly large D, features which are attributable to their predominantly ion-pair 
character in AO theory approximation (Mulliken, 1936, 1939). On the other 
hand, the corresponding T (triplet) states of the same configuration are 
repulsion states. The T and V states of H₂, well describable at small R values 

TABLE 2 

Some Low-Energy States of N₂ 

                       Excitation      Molecular constantsᵃ       Dissociation 
State                  energy (eV)     R_e (Å)     ω_e (cm⁻¹)     Energy D (eV)    Productsᵇ 

π_u³σ_g²π_g, ¹Σ_u⁺     12.85(?)ᶜ       1.44(?)ᶜ    752(?)ᶜ 
π_u³σ_g²π_g, ¹Δ_u      8.89            1.26        1548           5.57             ²D + ²D 
π_u³σ_g²π_g, ¹Σ_u⁻     8.40            1.27        1530           6.06             ²D + ²D 
π_u³σ_g²π_g, ³Σ_u⁻     8.16            1.28        1518           5.12             ⁴S + ²P 
π_u³σ_g²π_g, ³Δ_u      [7.17]ᵈ         [1.28]ᵈ     [1510]ᵈ        [4.94]ᵈ          ⁴S + ²D 
π_u³σ_g²π_g, ³Σ_u⁺     6.17            1.29        1460           3.59             ⁴S + ⁴S 
π_u⁴σ_gπ_g, ¹Π_g       8.55            1.22        1694           5.91             ²D + ²D 
π_u⁴σ_gπ_g, ³Π_g       7.35            1.21        1734           4.76             ⁴S + ²D 
π_u⁴σ_g², ¹Σ_g⁺        0.00            1.09        2358           9.76             ⁴S + ⁴S 

ᵃ See Wilkinson, P. G., and Mulliken, R. S. (1959). J. Chem. Phys. 31, 674; and Wilkinson, 
P. G. (1960). J. Chem. Phys. 32, 1061, for recent data on the ¹Δ_u, ¹Σ_u⁻, and ³Σ_u⁻ states. 

ᵇ All from the normal two-atom configuration 1s²2s²2p³ · 1s²2s²2p³. 

ᶜ It is not certain that the observed ¹Σ_u⁺ state (called b′) with these molecular constants 
is the predicted π_u³σ_g²π_g, ¹Σ_u⁺ state (see text). 

ᵈ Estimated. 


as 1σ_g1σ_u (or 1σ_g2pσ), ³Σ_u⁺ and ¹Σ_u⁺, represent a familiar example. This is 
an extreme case where the difference in dissociation energies and products 
exercises a dominant influence on R_e and ω_e. 

More typical is the situation illustrated by the CH states in Table 1, and 
most of the N₂ states in Table 2, where a common MO electron configuration 
tends strongly to establish definite R_e and ω_e values. However, it should be 
noted that in the examples cited, the electron configurations of the separate 
atoms (1s + 1s²2s²2p² for CH and 1s²2s²2p³ + 1s²2s²2p³ for N₂) are the 
same for all dissociation products for a given MO configuration even when 
the D values vary because of different states of the separate atoms. In the 
case of T and V states of an MO configuration, this condition is not fulfilled, 
and just in that case the R_e, ω_e near-constancy breaks down. In the case of N₂, 
it is probable that a similar break-down occurs for the ¹Σ_u⁺ state of the 
π_u³σ_g²π_g configuration, as compared with the other states. Unfortunately the 
empirical identification of this state is not certain, but trustworthy theoretical 
calculations predict that it differs strongly from all the other states of the 
same configuration in being much higher in energy and in being much more 
ionic in LCAPAS character than all the others. This last characteristic strongly 
suggests that it should behave like a V state, with small ω_e and large R_e, 
as is true of the tentatively identified observed state whose molecular constants 
are listed in Table 2. The questions touched on above, pertaining to the factors 
which determine R_e and ω_e values, deserve further exploration. 


IV. Theoretical Criteria for MO Bonding Powers 

Theoretically computed MO overlap populations per electron, $n_i/N_i$,
calculated for molecules at $R$ values in the MO region, seem to be rather
good as roughly quantitative indicators of the bonding properties of individual
MO’s in the case of valence-shell and inner-shell MO’s (Mulliken, 1955).
(Here $n_i$ means the overlap population in the $i$th MO when occupied by $N_i$
electrons.) Among other things, computed $n_i/N_i$ values correlate rather well
with values of $(\Delta R_e)_i$ for removal of an electron. The quantity $n_i/N_i$ can be
calculated for any MO in situ in a molecule, using information from SCF-MO
calculations. [A slightly better quantity should be $\Delta n_i$, the change in $n_i$ when
an electron is removed from the $i$th MO, with $R$ constant.]
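
As an illustration of the quantity $n_i/N_i$, the following minimal sketch evaluates a Mulliken-type overlap population per electron for a single MO written as a linear combination of atom-centered basis functions. It assumes real coefficients and the usual two-center form $n_i/N_i = 2\sum_{r \in A}\sum_{s \in B} c_{ir} c_{is} S_{rs}$; the function name, the two-function model, and the overlap value 0.6 are hypothetical choices made only for this example and are not taken from the calculations discussed in the text.

```python
import numpy as np

def overlap_population_per_electron(c, S, atom_of_basis):
    """Overlap population per electron, n_i/N_i, for one MO.

    c             : MO coefficients c_ir over the basis functions (real)
    S             : overlap matrix S_rs of the basis functions
    atom_of_basis : atom label for each basis function
    """
    c = np.asarray(c, dtype=float)
    S = np.asarray(S, dtype=float)
    atoms = np.asarray(atom_of_basis)
    # Keep only pairs (r, s) on different atoms; every pair appears twice
    # (as rs and as sr), which supplies the factor 2 of the definition.
    on_different_atoms = atoms[:, None] != atoms[None, :]
    return float(np.sum(np.outer(c, c) * S * on_different_atoms))

# Hypothetical two-function model of a homonuclear diatomic molecule.
S_AB = 0.6                                   # assumed overlap, illustration only
S = np.array([[1.0, S_AB], [S_AB, 1.0]])
bonding = np.array([1.0, 1.0]) / np.sqrt(2.0 * (1.0 + S_AB))
antibonding = np.array([1.0, -1.0]) / np.sqrt(2.0 * (1.0 - S_AB))

n_bond = overlap_population_per_electron(bonding, S, ["A", "B"])
n_anti = overlap_population_per_electron(antibonding, S, ["A", "B"])
print(n_bond, n_anti)
```

In this model the bonding combination gives $S/(1+S)$ and the antibonding combination gives $-S/(1-S)$, so the sign of $n_i/N_i$ follows the bonding or antibonding character, in line with the criterion described above.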

The validity of overlap populations as indicators of bonding is of course 
restricted to regions of R where MO’s can be reasonably well approximated by 
LCMAO expressions based on separate-atom AO’s; as is well known, the 
extent of positive or negative overlap of these is well correlated with bonding 
or antibonding respectively. However, the validity of the overlap population
criterion has no rigorous theoretical foundation, and it is fallible in special or 
extreme situations. A very different kind of measure of the bonding effects 
associated with electrons in individual MO’s seems to be afforded by an 
application of the Hellmann-Feynman theorem, in terms of forces exerted
on the nuclei by the electrons in various MO’s (Bader and Henneker, 1965). 
However, while the method seems to have interesting possibilities, until now 
there seems to be extremely poor correlation between the computed force 
exerted by an electron in an MO and its bonding power as judged by the 
thermochemical or equilibrium criterion. 
I. Introduction

Motivated by a desire to bridge and if possible to stem the widening gap
between the “computers” and the “non-computors” in molecular quantum
mechanics as so disturbingly defined by Coulson (1960), this paper and others
to follow present directly and on a consistent basis pictorial representations
of certain quantum-mechanical concepts. Here accurate charge density
pictures of molecular orbitals (MO’s) very close to the Hartree-Fock MO’s
(Wahl, 1966) are given.

Recently much has been written elsewhere about the development and
present state of the molecular-orbital method (Löwdin and Pullman, 1964;
Slater, 1965; Nesbet, 1965); we shall therefore confine ourselves to a brief
statement of crucial steps in its history.

The molecular orbital method as introduced by Mulliken (1928, 1929, 
1932) and Hund (1928) was used extensively in the semiempirical interpretation
of band spectra. However, mathematically and computationally the
concept matured rather slowly. Its early development (and the search for the 

* This research was supported by the following grant: National Aeronautics and Space 
Administration NsG-275-62 and based in part on work performed under the auspices of 
the USAEC 
“best” MO’s) may be traced from the recognition by Lennard-Jones (1929)
of its relationship to Hartree’s (1928) self-consistent field work on atoms 
followed by the introduction of the determinantal form for the wave function 
(Slater 1930a, 1932) with the application of the variational principle (Fock, 
1930; Slater, 1929, 1930b) to yield the now familiar pseudo-eigenvalue 
equations of the form

$$F\varphi_i = \epsilon_i\varphi_i \qquad (1)$$

known as the Hartree-Fock equations, which provide a rigorous mathematical 
definition of best orbitals. Lennard-Jones (1929) presented the equations 
for an arbitrary system; Coulson (1938) foreshadowed their solution by the 
expansion method; and Roothaan (1951, 1960) developed and perfected the 
extensively used matrix formulation of the expansion method. Important 
also are the proofs by Delbrück (1930), Löwdin (1962), and Roothaan (1960)
that the Hartree-Fock functions are always self-consistent, symmetry-adapted,
and correspond to a specific minimum of the total energy. Extremely relevant
to the potency and appeal of Hartree-Fock wave functions was the work of
Brillouin (1933, 1934) and Møller and Plesset (1934) on corrections to the Hartree-
Fock approximation. They showed that one-electron properties computed
from Hartree-Fock wave functions have first-order corrections in perturbation
theory which vanish provided that degeneracy is not present. Koopmans (1933)
developed similar theorems for ionization potentials.


II. Hartree-Fock Molecular Orbitals as Linear Combinations of Expansion Functions

Having clearly defined the Hartree-Fock model of a molecular system, it
still remained a formidable practical problem to obtain the MO’s $\varphi_i$. In 1951,
Roothaan had cast the Hartree-Fock equations into a solid computational
framework remarkably suitable for the then-embryonic digital computers.
In what is now referred to as the Roothaan (1951, 1960) method, the orbital
$\varphi_i$ is expanded in terms of some suitable truncated basis set $\chi_p$:

$$\varphi_i = \sum_p C_{ip}\chi_p. \qquad (2)$$

The expansion coefficients $C_{ip}$ are optimized through the iterative self-consistent
field process (Roothaan and Bagus, 1964). In the full numerology
of the process the best truncated set of basis functions $\chi_p$ is also hunted down,
usually by brute force methods. In practice a very close approximation to the
Hartree-Fock molecular orbitals can be obtained in this way. Calculations of
this type utilizing analysis and computer programs developed recently (Wahl,
1964; Wahl et al., 1964) have resulted in the determination of the molecular
orbitals for a large number of diatomic molecules in the form of Eq. (2). These
functions, in which the basis set $\chi_p$ consists of many Slater-type orbitals
(STO’s), are very close to the Hartree-Fock result. They were used in the pictorial
calculations presented in this work.*
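
The expansion of Eq. (2) can be made concrete with a small sketch that evaluates an orbital at chosen points from its coefficients and basis functions. The example below is deliberately reduced to one normalized 1s Slater-type orbital per nucleus, $\chi(\mathbf{r}) = (\zeta^3/\pi)^{1/2} e^{-\zeta|\mathbf{r}-\mathbf{R}|}$; the exponent, the nuclear positions, and the unnormalized coefficients are assumed values for illustration, not the optimized many-STO functions referred to in the text.

```python
import numpy as np

def sto_1s(points, center, zeta):
    """Normalized 1s Slater-type orbital centered at `center`."""
    diff = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    r = np.linalg.norm(diff, axis=-1)
    return np.sqrt(zeta**3 / np.pi) * np.exp(-zeta * r)

def molecular_orbital(points, coeffs, centers, zetas):
    """phi_i(r) = sum_p C_ip chi_p(r), the expansion of Eq. (2)."""
    return sum(c * sto_1s(points, R, z)
               for c, R, z in zip(coeffs, centers, zetas))

# Hypothetical homonuclear geometry (atomic units) and exponents.
centers = [(0.0, 0.0, -0.7), (0.0, 0.0, 0.7)]
zetas = [1.2, 1.2]
points = np.array([[0.0, 0.0, -0.3], [0.0, 0.0, 0.3], [0.0, 0.0, 0.0]])

phi_g = molecular_orbital(points, [1.0, 1.0], centers, zetas)   # sigma_g-like
phi_u = molecular_orbital(points, [1.0, -1.0], centers, zetas)  # sigma_u-like
print(phi_g, phi_u)
```

The gerade combination is symmetric about the midpoint of the bond and the ungerade combination vanishes there, which is the behavior the contour diagrams make visible.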


III. Densities and Contours 

At this point, in order to clarify the diagrams of the shell model, it is
convenient to introduce two new indices $\lambda$ and $\alpha$ which indicate respectively
the symmetry species and subspecies of the molecular orbitals $\varphi_i$. The
electronic density $\rho_{i\lambda}$ associated with the $i\lambda$th molecular shell at a point $\mathbf{r}$ in
space is defined by

$$\rho_{i\lambda}(\mathbf{r}) = e^{-} N_{i\lambda}\, d_\lambda^{-1} \sum_\alpha \varphi_{i\lambda\alpha}(\mathbf{r})\,\varphi^*_{i\lambda\alpha}(\mathbf{r}) \qquad (3)$$

where we have now grouped the molecular orbitals $\varphi_{i\lambda\alpha}$ according to their
symmetry species $\lambda$ and their subspecies $\alpha$ and have defined the density of
shell $i\lambda$ which contains $N_{i\lambda}$ electrons in terms of the sum over the modulus
squared of the $d_\lambda$ degenerate molecular orbitals making up the shell. Here $e^-$
is the charge on the electron (negative number). The total electron density
$\rho(\mathbf{r})$ of the molecule is then given by

$$\rho(\mathbf{r}) = \sum_\lambda \sum_i \rho_{i\lambda}(\mathbf{r}) \qquad (4)$$

and is thus the sum of the densities of all shells making up the molecule.

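The density associated with one of the $d_\lambda$ degenerate molecular orbitals
$\varphi_{i\lambda\alpha}$ making up shell $i\lambda$ is

$$\rho_{i\lambda\alpha}(\mathbf{r}) = \rho_{i\lambda}(\mathbf{r})/d_\lambda \qquad (5)$$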

which is just the shell density divided by the number of degenerate molecular 
orbitals making up the shell. In the diagrams presented in this paper it is the 
total density [Eq. (4)] and the orbital density [Eq. (5)] which have been plotted. 
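
The three definitions, Eqs. (3)-(5), can be sketched numerically as follows. The sketch assumes that the orbital values of one shell are already tabulated at a set of points (one row per degenerate orbital) and takes the electron charge as $-1$; the function names and the sample orbital values are hypothetical and serve only to show the bookkeeping.

```python
import numpy as np

def shell_density(orbital_values, n_electrons, charge=-1.0):
    """Eq. (3): charge * N / d * sum over the d degenerate orbitals of |phi|^2.

    orbital_values : array of shape (d, n_points), or (n_points,) when d = 1
    """
    values = np.atleast_2d(np.asarray(orbital_values))
    d = values.shape[0]
    return charge * n_electrons / d * np.sum(np.abs(values) ** 2, axis=0)

def orbital_density(shell, d):
    """Eq. (5): density of one of the d degenerate orbitals of a shell."""
    return shell / d

def total_density(shells):
    """Eq. (4): sum of the densities of all shells."""
    return np.sum(shells, axis=0)

# Made-up orbital values at two points, for illustration only.
sigma = shell_density([0.5, 0.2], n_electrons=2)                 # d = 1
pi = shell_density([[0.3, 0.0], [0.0, 0.4]], n_electrons=4)      # d = 2
pi_orbital = orbital_density(pi, d=2)
rho = total_density([sigma, pi])
print(sigma, pi, pi_orbital, rho)
```

With the negative charge convention of the text all densities come out negative; only their magnitudes matter for the contour levels.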

* Wave functions: The molecular orbital SCF wave functions used were F$_2$ (Wahl, 1964);
O$_2$ (Malli and Cade, 1966); N$_2$ (Cade et al., 1966); C$_2$ and B$_2$ (Greenshields, 1966);
Li$_2$ (Sales et al., 1966); H$_2$ (Das and Wahl, 1966). All wave functions are comparable
in sophistication to the published F$_2$ and N$_2$ functions. The basis sets have been extensively
optimized and explored. All these MO wave functions can be obtained on request from
the author.
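TABLE 1

Electron Configuration of Molecules

Total molecular                                                                                                                  Spectroscopic
density            Molecular shells                                                                                              designation

H$_2$              $1\sigma_g^2$                                                                                                 $^1\Sigma_g^+$
He$_2$             $1\sigma_g^2$  $1\sigma_u^2$                                                                                  $^1\Sigma_g^+$
Li$_2$             $1\sigma_g^2$  $1\sigma_u^2$  $2\sigma_g^2$                                                                   $^1\Sigma_g^+$
Be$_2$             $1\sigma_g^2$  $1\sigma_u^2$  $2\sigma_g^2$  $2\sigma_u^2$                                                    $^1\Sigma_g^+$
B$_2$              $1\sigma_g^2$  $1\sigma_u^2$  $2\sigma_g^2$  $2\sigma_u^2$  $1\pi_u^2$                                        $^3\Sigma_g^-$
C$_2$              $1\sigma_g^2$  $1\sigma_u^2$  $2\sigma_g^2$  $2\sigma_u^2$  $1\pi_u^4$                                        $^1\Sigma_g^+$
N$_2$              $1\sigma_g^2$  $1\sigma_u^2$  $2\sigma_g^2$  $2\sigma_u^2$  $1\pi_u^4$  $3\sigma_g^2$                         $^1\Sigma_g^+$
O$_2$              $1\sigma_g^2$  $1\sigma_u^2$  $2\sigma_g^2$  $2\sigma_u^2$  $1\pi_u^4$  $3\sigma_g^2$  $1\pi_g^2$             $^3\Sigma_g^-$
F$_2$              $1\sigma_g^2$  $1\sigma_u^2$  $2\sigma_g^2$  $2\sigma_u^2$  $1\pi_u^4$  $3\sigma_g^2$  $1\pi_g^4$             $^1\Sigma_g^+$
Ne$_2$             $1\sigma_g^2$  $1\sigma_u^2$  $2\sigma_g^2$  $2\sigma_u^2$  $1\pi_u^4$  $3\sigma_g^2$  $1\pi_g^4$  $3\sigma_u^2$   $^1\Sigma_g^+$

Note: Superscript of 2 or 4 indicates number of electrons $N_{i\lambda}$ occupying molecular shell
$i\lambda$. For the $\pi$ shells, which consist of 2 degenerate molecular orbitals, it is a molecular orbital,
containing $\tfrac{1}{2}$ of the electrons in the $\pi$ shell, that has been plotted.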


(For $\sigma$ symmetry $d_\lambda = 1$ and thus the orbital density equals the shell density.
For $\pi$ symmetry in diatomic molecules, $d_\lambda = 2$ and the orbital density equals
$\tfrac{1}{2}$ of the shell density. The molecular shells and their occupation $N_{i\lambda}$ are given
in Table 1 for the molecules studied. The only molecular symmetries occurring
in this work are $\sigma_g$, $\sigma_u$, $\pi_u$, and $\pi_g$.)

In what follows in this section the symmetry indexes $\lambda$ and $\alpha$ of the orbital
density $\rho_{i\lambda\alpha}(\mathbf{r})$ will be suppressed since they are unnecessary for the description
of the contour drawing process.

An orbital contour line indicating a density $C$ in the $xz$-plane ($\rho$ and $\rho_i$ for
diatomic molecules are cylindrically symmetric about the $z$-axis and plots in
any plane containing this axis convey complete density information) may be
defined by the equation

$$\rho_i(x, z) = C,$$

and its path by the relation

$$\frac{\partial\rho_i}{\partial x}\,\Delta x + \frac{\partial\rho_i}{\partial z}\,\Delta z = 0,$$

which gives the direction of the tangent to the contour at any point on it to be

$$\frac{\Delta x}{\Delta z} = -\frac{\partial\rho_i/\partial z}{\partial\rho_i/\partial x}. \qquad (6)$$

A step $\Delta s = (\Delta x^2 + \Delta z^2)^{1/2}$ is taken along this tangent and a density found
such that

$$\rho_i'(x + \Delta x, z + \Delta z) = C + \Delta\rho_i; \qquad (7)$$

then a correction is applied perpendicular to the initial tangent along the new
line

$$\frac{\Delta x'}{\Delta z'} = \frac{\partial\rho_i/\partial x}{\partial\rho_i/\partial z}$$

a distance

$$\Delta z' = \frac{-\Delta\rho_i}{\dfrac{\partial\rho_i'}{\partial z} + \dfrac{\partial\rho_i'}{\partial x}\,\dfrac{\partial\rho_i/\partial x}{\partial\rho_i/\partial z}}. \qquad (8)$$

This correction [Eq. (8)] is continued until $\Delta\rho_i$ falls within a small preset
threshold. This hunt process [Eqs. (6-8)] is continued until the entire contour
is traced out. Analogous equations result for the total molecular density or
for any linear combination of molecular orbital densities.
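
The hunt process of Eqs. (6)-(8) may be sketched as a simple predictor-corrector loop. The version below is an illustration under stated assumptions, not the program used for the diagrams: the gradient is taken by central differences, the correction of Eq. (8) is written in the equivalent vector form (a displacement along the initial gradient direction, which avoids dividing by a vanishing $\partial\rho_i/\partial z$), and the Gaussian test density, step length, and tolerances are arbitrary choices.

```python
import numpy as np

def gradient(rho, p, h=1e-6):
    """Central-difference gradient (d rho/dx, d rho/dz) at the point p."""
    x, z = p
    return np.array([(rho(x + h, z) - rho(x - h, z)) / (2.0 * h),
                     (rho(x, z + h) - rho(x, z - h)) / (2.0 * h)])

def trace_contour(rho, start, level, step=0.05, n_steps=200,
                  tol=1e-10, max_corrections=50):
    """Follow the contour rho(x, z) = level, starting from a point on it."""
    p = np.asarray(start, dtype=float)
    points = [p.copy()]
    for _ in range(n_steps):
        g = gradient(rho, p)
        normal = g / np.linalg.norm(g)          # assumes a nonzero gradient
        tangent = np.array([-normal[1], normal[0]])
        q = p + step * tangent                  # predictor step, Eqs. (6)-(7)
        for _ in range(max_corrections):        # corrector, Eq. (8)
            d_rho = rho(*q) - level
            if abs(d_rho) < tol:
                break
            q = q - d_rho / np.dot(gradient(rho, q), normal) * normal
        p = q
        points.append(p.copy())
    return np.array(points)

# Hypothetical test density whose contours are circles about the origin.
def rho(x, z):
    return np.exp(-(x * x + z * z))

level = np.exp(-1.0)                            # contour of radius 1
points = trace_contour(rho, start=(1.0, 0.0), level=level)
radii = np.hypot(points[:, 0], points[:, 1])
print(radii.min(), radii.max())
```

For this test density every traced point should lie on the unit circle to within the preset threshold, and 200 steps of length 0.05 are enough to carry the trace once around the contour.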

The input to the computer program consists of the symmetry basis functions
$\chi_{p\lambda\alpha}$, the orbital coefficients $C_{ip}$, the internuclear distance, a series of the contour
values desired with the associated thresholds, and finally the physical
scale in which diagrams are to be plotted. The output consisted of 35 mm
negatives of the diagrams presented in this work. The process has been more
completely documented elsewhere (Wahl, 1966).


IV. Results and Implications 

In Fig. 2 the contours of density associated with the homonuclear diatomic 
molecules constructed from first row atoms are given on a consistent basis 
as defined in the key (Fig. 1). Both the total molecular densities and the orbital 
densities are displayed. [These diagrams are available in a larger scale (Wahl, 
1966).] It is hoped that in addition to their obvious tutorial value these contour 
diagrams of the molecular-orbital model for these simple homonuclear 
diatomic molecules, H$_2$, Li$_2$, B$_2$, C$_2$, N$_2$, O$_2$, and F$_2$, will prove to be useful
symbols which will stimulate thought about chemical binding, steric hindrance,
bonding and antibonding orbitals, in addition to providing a correct and more
complete picture of molecular orbitals where only a rudimentary one, based
primarily on hydrogen atom wave functions and single STO’s, existed before. 

Using these computational techniques, concepts and changes which are
best presented visually may be so presented. Such visual presentations have
been quite limited in the past due to the prohibitive labor involved (Huo,
1965; Peyerimhoff, 1965). Studies of interatomic forces and the formation of
the chemical bond using extended Hartree-Fock wave functions (Das and
Wahl, 1966) are underway in which these programs are being used to display
the changes occurring in electronic charge density as a molecule forms. In a
study of molecular ionization these automatic contour programs are being
used to illustrate directly changes in the molecular charge distribution with
electron removal. In other theoretical work a pictorial display of configuration
mixing provides a physical picture of wave-function improvements and electron
correlation as produced by added optimal configurations. This work
contains the development of a new tool; namely, the synthesis of high-speed
digital computers and linked analog devices into a medium capable of efficiently
communicating certain types of new information. Since many of us
involved in large scale computational efforts are often swamped by our own
computer output and are able to competently analyze only a small fraction
of the potentially useful information we have generated, this problem of
communication is well worth consideration (Coulson, 1960).

Fig. 2. Comparison of density of molecular orbitals. Larger scale diagrams are available, see Wahl (1966).
Localized Orbitals and Localized
Adjustment Functions 
Soon after the acceptance of Slater determinants as the most useful antisymmetric
functions for electronic quantum calculations it was realized that
these were left invariant by any unitary transformation of the orbitals, and
that the new orbitals could be chosen to be the most suitable for the investigation
of a particular physical problem [see Pauling (1931) and Slater (1931)].
The most interesting case is when a transformation is made to orbitals which 
are localized as much as possible. 

The early investigations of such transformations were mostly for particular 
approximations such as that where atomic orbitals are hybridized to give 
functions more appropriate for use in approximate molecular calculations. 
Later treatments have approached nearer to the general molecular problem. 
The well-known equivalent orbitals investigated by Hall and Lennard-Jones 
(1950) provided a general method for molecules with certain symmetry. 

The first practical method of construction of localized orbitals for a general
molecule was given by Foster and Boys (1960) by their definition of exclusive
orbitals. These were defined in terms of the matrix elements of the dipole
moment operator between all the orbitals of the molecule and they were
calculated by a fairly simple iteration procedure using these $3n^2$ matrix
elements. Later Edmiston and Ruedenberg (1963) performed a calculation
with the criterion of the minimization of electrostatic exchange energy, which
was an alternative definition discussed by Lennard-Jones and Pople (1950).
This was a rather heavy iterative calculation dependent on the $n^4$ matrix
elements of the two-electron electrostatic integrals.

A completely new addition to the localized orbitals was introduced by Foster 
and Boys in an attempt to define some localized functions suitable for use in 
the additional Slater determinants which have to be combined with the first, 
or SCF, determinant to allow for electronic correlation. These new localized 
functions were described as oscillator orbitals. Unfortunately the original 
definition had a deficiency which makes it satisfactory only for the lowest 
range of members. This is remedied in the formulation given below. The new
definition satisfies the same aims as originally described. In the calculation for 
the formaldehyde molecule by these authors only the lowest oscillator orbitals 
were used and these are unaffected by the change in the definition. 

Such a general scheme of localization is of considerable interest because it 
provides the only method yet proposed for a localized analysis of induced 
phenomena and other properties of molecules and crystals. Any such property 
of the whole system can be expressed as a sum of contributions each of which 
can be ascribed to a particular localized electron pair, or to an interaction 
between two of these. In the latter case the contribution to the property can 
have half of its magnitude ascribed to each pair. This appears to provide a 
complete quantum analogue to the way in which it was once hoped with 
classical mechanics to attribute all phenomena to contributions from individual 
electrons or the interactions between pairs of these. All this can be set out 
formally once a system of localized orbitals and expansion functions has been 
defined by some procedure such as the exclusive oscillator system of functions 
proposed by Foster and Boys. The detailed definitions which seem most 
satisfactory at the present will now be set out. 

The exclusive orbitals will be defined first, since the oscillator orbitals are 
later defined in terms of these. The definition proposed here is a minor 
modification of that given by Foster and Boys (1960). The reason for the 
change and the relation between the two forms will be shown later. It will 
also be shown that there are several equivalent forms of the present statement 
and that although the one given now is the most compact, a later one is the 
simplest for computation. The present condition is that the sum of the quadratic
repulsions of the orbitals with themselves shall be minimized. This can be
written as the minimization of

$$I = \sum_a^n \langle\varphi_a\varphi_a|r_{12}^2|\varphi_a\varphi_a\rangle \qquad (1)$$

subject to the condition that the $\varphi_a$ are orthonormal linear combinations of
the space factors of the $2n$ occupied electronic orbitals. The whole of the
present analysis is restricted to the circumstances when the Slater determinant
contains each spatial orbital both with $\alpha$ and $\beta$ spin functions. The above
expression is written in the usual quantum matrix element notation and, in a
product of functions at either side, it is implied that the first function depends
on $\mathbf{r}_1$, the second function on $\mathbf{r}_2$, and so forth. Thus in a single electron integral
we shall write $\langle\varphi_a|r_1^2|\varphi_a\rangle$ to mean that all the quantities are dependent on
$x_1$, $y_1$, $z_1$. It will be shown later that the above definition is equivalent to the
maximization of the sum of the squares of the distances of the orbitals from
each other and so is similar to the original definition.

When these orbitals have been calculated for a molecule under consideration 
the oscillator orbitals are to be derived from a system of primary functions

$$\bar\varphi_{apqs} = x_a^p\, y_a^q\, z_a^s\, \varphi_a. \qquad (2)$$

Here the coordinates $x_a$, $y_a$, $z_a$ are defined to be measured from the centroid
of the orbital density $\varphi_a^*\varphi_a$ and with respect to axes taken parallel to the principal
axes of the moment of inertia tensor of the distribution $\varphi_a^*\varphi_a$. The
longest axis is to be ascribed to $x$ and the shortest to $z$.

In this notation the functions with $p = q = s = 0$ are just the original
exclusive orbitals, which can be regarded as the early members of the whole
sequence. The final functions which will be obtained after a special orthonormalization
process will be denoted by $\varphi_{apqs}$. The orthonormalization
method can be regarded as a Schmidt procedure between classes of different
degree, which is defined to be $p + q + s$, and a Löwdin-type orthonormalization
within each class of specified degree. Hence within a class of given degree the
order of the functions does not affect the result and any symmetry among the
$\bar\varphi_{apqs}$ also occurs among the $\varphi_{apqs}$. There is always a one-to-one correspondence
between the original and the derived set and if the former were orthonormal
then the orthonormalization procedure would cause no change.

The quantity $p + q + s$ will be called the degree of a particular function
$\bar\varphi_{apqs}$ and the whole system will be considered as arranged in sets of the same
degree with the sets in order of increasing degree. Let some arbitrary choice
be specified for the order within these sets and let $\bar\eta^k_u$ denote the $u$th member
of the $k$th set of the $\bar\varphi_{apqs}$ so arranged. Let $\eta^k_u$ be the $\varphi_{apqs}$ which is derived
from this $\bar\eta^k_u$.

The most compact definition of the orthonormal $\eta^k_u$ is given by describing
the procedure for $\eta^k_u$ when all the $\eta^l_v$ of lower degree have already been calculated.
For the first stage it is postulated that

$$\varphi_{a000} = \bar\varphi_{a000}, \quad \text{that is,} \quad \eta^0_u = \bar\eta^0_u. \qquad (3)$$

For the functions of any degree $k$ there are defined a set of intermediate
functions

$$\tilde\eta^k_u = \Big[\bar\eta^k_u - \sum_{l}^{k-1}\sum_v \eta^l_v\,\langle\eta^l_v|\bar\eta^k_u\rangle\Big]\Big[\langle\bar\eta^k_u|\bar\eta^k_u\rangle - \sum_{l}^{k-1}\sum_v \langle\bar\eta^k_u|\eta^l_v\rangle\langle\eta^l_v|\bar\eta^k_u\rangle\Big]^{-1/2}. \qquad (4)$$

This is just the relation of the Schmidt method of orthonormalization to make
each $\tilde\eta^k_u$ orthogonal to all the functions of lower degree, though not to the
functions with the same $k$ value. The next stage is to make these orthonormal
to each other by a procedure which is closely related to the Löwdin procedure
of making orthonormal linear combinations. The $\eta^k_u$ are defined to be those
linear combinations

$$\eta^k_u = \sum_v \tilde\eta^k_v\, T^k_{vu} \qquad (5)$$
of the $\tilde\eta^k_v$ which maximize

$$I = \mathrm{Re}\sum_u \langle\eta^k_u|\tilde\eta^k_u\rangle \qquad (6)$$

subject to the condition that they are orthonormal, that is

$$\langle\eta^k_u|\eta^k_v\rangle = \delta_{uv}. \qquad (7)$$

Since the sum of the overlaps with the original functions is being maximized
the $\eta^k_u$ may be said to be the best approximations to the original $\bar\eta^k_u$ which
are also orthogonal to the lower sets and which are orthonormal to each other.
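
The two-stage procedure described above (Schmidt projection between classes of different degree, followed by a Löwdin-type symmetric orthonormalization within each class) can be sketched in a few lines. The sketch assumes that each primary function is represented by a real column vector of expansion coefficients in some underlying orthonormal basis, so that scalar products are ordinary dot products, and that the functions are linearly independent; the function name and the random test vectors are hypothetical. The symmetric orthonormalization is used here because it is the standard construction that maximizes the summed overlap between the orthonormal functions and the normalized intermediate functions.

```python
import numpy as np

def orthonormalize_by_class(classes):
    """Orthonormalize sets of column vectors, one array per degree.

    classes : list of arrays of shape (n_basis, m_k), in order of increasing
              degree; columns are the primary functions of that degree.
    Returns a list of arrays of the same shapes with orthonormal columns.
    """
    lower = np.zeros((classes[0].shape[0], 0))
    result = []
    for X in classes:
        # Schmidt stage: remove the components along all functions of lower
        # degree, then normalize each intermediate function separately.
        Y = X - lower @ (lower.T @ X)
        Y = Y / np.linalg.norm(Y, axis=0)
        # Lowdin-type stage: symmetric orthonormalization within the class,
        # eta = Y S^(-1/2) with S the overlap matrix of the intermediates.
        w, U = np.linalg.eigh(Y.T @ Y)
        eta = Y @ ((U * w ** -0.5) @ U.T)
        result.append(eta)
        lower = np.hstack([lower, eta])
    return result

rng = np.random.default_rng(0)
classes = [rng.normal(size=(8, 2)), rng.normal(size=(8, 3))]
out = orthonormalize_by_class(classes)
Q = np.hstack(out)

# Reordering the functions inside a class only reorders the results.
reordered = orthonormalize_by_class([classes[0], classes[1][:, ::-1]])
print(np.round(Q.T @ Q, 12))
```

The second call illustrates the statement in the text that the order of the functions within a class of given degree does not affect the result.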

When the orthonormal $\eta^k_u$ have been obtained it is possible to revert to the
original type of notation and to denote these by $\varphi_{apqs}$ where the values of
$a$, $p$, $q$, and $s$ are the same as in the original $\varphi_a x_a^p y_a^q z_a^s$ from which this was
derived. However for physical theoretical work it is most convenient to replace
the last three suffixes by a single suffix $m$. Let $m = 0, 1, 2, \ldots$, stand for all the
combinations of $p$, $q$, $s$ arranged in order of increasing value of the degree,
and in descending order of $p$ within a set of the same degree, and in decreasing
order of $q$ within a set of constant $p$ and constant degree. In principle any
other choice which gave an unambiguous labeling of all the functions derived
from a given exclusive orbital $\varphi_a$ or $\varphi_{a0}$ would be satisfactory. It can, however,
be noted that with this choice, when $\varphi_{a0}$ corresponds to a chemical bond, the
first oscillator orbital $\varphi_{a1}$ is an approximation to the antibonding orbital of
approximate theories, while $\varphi_{a2}$ and $\varphi_{a3}$ could be said to correspond to transverse
polarizations of the bond.
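
The labeling rule for the single suffix $m$ can be written out explicitly. The short sketch below enumerates the triples $(p, q, s)$ in the stated order (increasing degree, then descending $p$, then descending $q$); the function name is a hypothetical one chosen for this illustration.

```python
def oscillator_index_order(max_degree):
    """Return the (p, q, s) triples in the order m = 0, 1, 2, ..."""
    order = []
    for degree in range(max_degree + 1):
        for p in range(degree, -1, -1):            # descending p
            for q in range(degree - p, -1, -1):    # descending q at fixed p
                order.append((p, q, degree - p - q))
    return order

order = oscillator_index_order(2)
for m, pqs in enumerate(order):
    print(m, pqs)
```

With this ordering $m = 1$ is the function with one power of $x$, the longest axis of the orbital, while $m = 2$ and $m = 3$ carry one power of $y$ and of $z$, which matches the interpretation of $\varphi_{a1}$, $\varphi_{a2}$, and $\varphi_{a3}$ given above. Up to degree 2 there are ten functions per exclusive orbital.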

It will be assumed that the oscillator orbitals (including the exclusive
orbitals as first members) will be a complete system of functions. They will be
complete if the original nonorthogonal system $\varphi_a x_a^p y_a^q z_a^s$ is complete, and although
we cannot prove this, it is very likely to be true for this type of function.
In fact by its general nature a partial set of the type $(1s)x^p y^q z^s$, arising from
an inner shell, is likely to be complete. In any case it is not likely in the foreseeable
future that more than ten oscillator orbitals per exclusive orbital will
be used in a practical calculation and these early functions appear very suitable
to be good expansion functions for most purposes.

The most direct application of the oscillator orbitals is to express in a 
compact form the more accurate wave function approximations which consist 
of linear combinations of many Slater determinants. Such forms allow for the 
effects of electronic correlation and are frequently described as including 
configurational interaction. All the Slater determinants except the first 
depend on other single electron functions as well as the occupied orbitals 
and it was suggested by Foster and Boys (1960) that the oscillator orbitals 
would be very effective for these. In principle if any complete system of single-electron
functions were used in the Slater determinants the result would converge
to the correct answer and if the above definition of oscillator orbitals is
satisfactory this should be true for them. In this case any wave function can
be approximated to sufficient accuracy if sufficient Slater determinants are
used but it is also to be expected that the Slater determinants will only occur
with coefficients which fall off rapidly as higher substitutions of the exclusive
orbitals by oscillator orbitals are made, or if exclusive orbitals are replaced
by oscillator orbitals from more distant sets. In various practical calculations
the first Slater determinant, which consists only of the exclusive orbitals
each associated with an $\alpha$ and $\beta$ spin, has been found to have coefficients in the
range 0.97 to 0.99. The other Slater determinants are conveniently denoted
by means of substitution operators $P(x, y, \ldots / x', y', \ldots)$ which denote the
operation of replacing $x'$ by $x$, $y'$ by $y$, etc.

In this notation the most important terms after the first may be written as
$P(\varphi_{a1}\alpha\,\varphi_{a1}\beta / \varphi_{a0}\alpha\,\varphi_{a0}\beta)\Phi_1$, which shows that the particular exclusive orbital $\varphi_a$
has been replaced by its first oscillator orbital in both the $\alpha$ and $\beta$ functions.
With the above conventions this first oscillator orbital has been chosen to be
a function which approximates to the ordinary antibonding orbital. In this
case, a Slater determinant with the above structure is usually found to have a
coefficient in the range from -0.07 to -0.09. Such terms increase the
probability of finding the electrons at the opposite ends of a bond. It is
convenient to let $P(\varphi_{a1}\varphi_{b1}/\varphi_{a0}\varphi_{b0})\Phi_1$ denote all the relevant spin combinations
when one of the exclusive orbitals $\varphi_{a0}$ is replaced by its first oscillator orbital $\varphi_{a1}$
and another $\varphi_{b0}$ is replaced by its first oscillator orbital. Slater determinants
of this type have been found generally to occur with coefficients whose magnitudes
are in the range from 0.02 to 0.03. They correspond to the longitudinal
correlation between the electrons in different bonds. Not many such calculations
have yet been performed and in these it has not been possible to include
functions of as high a degree as is desirable. But the available evidence does
suggest that the coefficients fall off rapidly and systematically as more distant
replacements or replacements of higher degree are made.

The value of this analysis is not only that the largest terms are obtained at 
the beginning, it lies also in the fact that each of these terms corresponds in 
some definite way to the bonds in a particular locality so that we might expect 
similar terms with the same size of coefficients in a similar locality for a calculation
on another molecule. Of course, the neighboring bonds would have
to be the same for a close correspondence, but then we might expect considerable
similarity. The term corresponding to the correlation within a single bond
would probably agree within 10% or so, and also terms such as the above, 
corresponding to the correlation between neighboring bonds of some specified 
chemical types, would be expected to be approximately invariant. 

Consider other effects of chemical interest such as the change at a few bonds
distant when a C atom is replaced by an N$^+$ ion. In such a case it would be
interesting not to recalculate the new wave function from the beginning but
to find the changes in the orbitals in terms of the exclusive and oscillator
orbitals of the original molecule. This can be done by a calculation of the 
perturbation type. We should expect similar changes in similar molecules and 
that the same changes could be predicted for other similar molecules without 
further calculation. It would be expected that the changes in any exclusive 
orbital, when expanded in terms of all oscillator orbitals, would only have 
appreciable coefficients for the oscillator orbitals corresponding to this 
exclusive orbital itself and to its nearest neighbors. Such a calculation would 
be quite straightforward for the single Slater-determinant approximation but 
it would require sophisticated analysis for a general case which included 
configurational interaction. 

Consider a perturbation of the whole system by some field $V$. Such calculations
are most frequently performed in terms of the Hamiltonian eigenfunctions,
but, at the expense of introducing reciprocals of nondiagonal matrixes,
it is possible to perform them in terms of any system of expansion functions,
and this can be done in terms of the oscillator orbitals. In this case the first-order
perturbed value $D_1$ of an observable with operator $D$ is found to be
linearly dependent on the matrix elements of $D$ and the coefficients of these
to be linearly dependent on the matrix elements of $V$, so that the total observed
effect can be written

$$D_1 = \sum_{abcd}\sum_{klpq} C(a, b, c, d, k, l, p, q)\,\langle\varphi_{ak}|D|\varphi_{bl}\rangle\langle\varphi_{cp}|V|\varphi_{dq}\rangle \qquad (8)$$

where the $C(a, b, c, d, k, l, p, q)$ are the appropriate coefficients to be evaluated
in the calculation on any particular molecule. It is now possible to define
a quantity which measures the effect arising in the vicinity of $\varphi_a$ due to the
perturbing effect acting in the vicinity of $\varphi_d$. Let $D_{ad}$ be defined to be

$$D_{ad} = \mathrm{Re}\sum_{bc}\sum_{klpq} C(a, b, c, d, k, l, p, q)\,\langle\varphi_{ak}|D|\varphi_{bl}\rangle\langle\varphi_{cp}|V|\varphi_{dq}\rangle. \qquad (9)$$

Since the original $D_1$ is real, the Re, denoting the real part operation, can
be omitted before it and it follows that $D_1 = \sum_{ad} D_{ad}$. It can be seen that
normally the largest contributions to $D_{ad}$ will come from the terms in which
$b = a$ and $c = d$ and that the contributions will be very small if $\varphi_b$ is at all
distant from $\varphi_a$, etc. This is the main justification for taking this expression
to be the effect caused at $\varphi_a$ by the field at $\varphi_d$. It can be seen that if the $\varphi_a$
were nearly separate from each other, as would be the case in a system of a
few separate He atoms, then $D_{ad}$ would measure the physical effect caused in
one atom by the field acting at another atom. The symmetric terms $D_{aa}$ would
normally be expected to be the largest terms since they represent the effect
at one orbital due to the field acting on the same orbital.
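
The bookkeeping of Eqs. (8) and (9) can be illustrated with a small array sketch. Everything numerical here is a placeholder: the coefficient array and the two sets of matrix elements are filled with random real numbers purely to show how the contributions $D_{ad}$ are formed and that they add up to $D_1$; they are not results for any molecule, and the array layout is an assumption of this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_orb, n_osc = 3, 2   # exclusive orbitals; oscillator members kept per orbital

# Placeholder arrays (random, illustration only):
#   C[a, b, c, d, k, l, p, q]   coefficients of Eq. (8)
#   D[a, k, b, l] = <phi_ak | D | phi_bl>
#   V[c, p, d, q] = <phi_cp | V | phi_dq>
C = rng.normal(size=(n_orb,) * 4 + (n_osc,) * 4)
D = rng.normal(size=(n_orb, n_osc, n_orb, n_osc))
V = rng.normal(size=(n_orb, n_osc, n_orb, n_osc))

# Eq. (8): total first-order effect, summed over every index.
D1 = np.einsum('abcdklpq,akbl,cpdq->', C, D, V)

# Eq. (9): keep a and d free, sum over b, c and k, l, p, q, take the real part.
D_ad = np.einsum('abcdklpq,akbl,cpdq->ad', C, D, V).real

print(D1, D_ad.sum())
```

The array `D_ad` has one entry for each pair of localized orbitals, so its rows and columns can be read as "effect at $\varphi_a$" and "field at $\varphi_d$" respectively.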

The above definition is a mathematical way of representing quantities which 
the chemist would intuitively describe as the effect transmitted from one bond 
to another. However the setting out of the details of the mathematical analysis
for particular properties will be quite complicated and the construction of a 
systematic method of computation will be still more complicated. G. W. 
Smith and the writer have made an exploratory calculation of this type for 
ethylene. This calculation has been found to be difficult even though programs 
were available for general molecular integrals, and the calculation has only 
been performed to the refinement of using one oscillator orbital per exclusive 
orbital. However the subsequent development after a first beginning is frequently
much more rapid than appears feasible at such a time. Also once the
formulation of the general representation becomes familiar it is quite possible 
that alternative methods of calculation may become apparent. 

It is interesting to see that the definition of exclusive orbitals which was
given above is equivalent to several other simple definitions. It should be
noted that for the use of the second criterion given below it is only necessary
to know the dipole moment matrix to calculate the exclusive orbitals. Let $\mathbf{R}_a$
be defined by

$$\mathbf{R}_a = \langle\varphi_a|\mathbf{r}_1|\varphi_a\rangle \tag{10}$$

so that this is the centroid of the orbital $\varphi_a$. Let $\mathbf{r}_{1a} = \mathbf{r}_1 - \mathbf{R}_a$ so that the
criterion which was minimized for the exclusive orbitals may be written

$$\begin{aligned}
I &= \sum_a \langle\varphi_a\varphi_a|r_{12}^2|\varphi_a\varphi_a\rangle = \sum_a \langle\varphi_a\varphi_a|(\mathbf{r}_{1a} - \mathbf{r}_{2a})^2|\varphi_a\varphi_a\rangle \\
&= \sum_a \langle\varphi_a\varphi_a|r_{1a}^2 + r_{2a}^2 - 2\,\mathbf{r}_{1a}\cdot\mathbf{r}_{2a}|\varphi_a\varphi_a\rangle \\
&= 2\sum_a \langle\varphi_a|r_{1a}^2|\varphi_a\rangle - 2\sum_a \langle\varphi_a|\mathbf{r}_1 - \mathbf{R}_a|\varphi_a\rangle\cdot\langle\varphi_a|\mathbf{r}_1 - \mathbf{R}_a|\varphi_a\rangle \\
&= 2\sum_a \langle\varphi_a|r_{1a}^2|\varphi_a\rangle.
\end{aligned} \tag{11}$$

The definition of $\mathbf{R}_a$ justifies the omission of the term shown in the last
equality. This last expression is just the sum of the spherical quadratic
moments of each exclusive orbital about its centroid and it shows that the
minimization process is equivalent to making each exclusive orbital contract
as close to its own centroid as can be done while retaining the condition that
these are orthonormal linear combinations of the original orbitals of the given
Slater determinant.

The next criterion follows by writing $I$ with another small alteration of form.

$$\begin{aligned}
I &= \sum_a \langle\varphi_a\varphi_a|(\mathbf{r}_2 - \mathbf{r}_1)^2|\varphi_a\varphi_a\rangle \\
&= 2\sum_a \langle\varphi_a|r_1^2|\varphi_a\rangle - 2\sum_a \langle\varphi_a|\mathbf{r}_1|\varphi_a\rangle\cdot\langle\varphi_a|\mathbf{r}_1|\varphi_a\rangle \\
&= -2\sum_a R_a^2 + 2\sum_a \langle\varphi_a|r_1^2|\varphi_a\rangle.
\end{aligned} \tag{12}$$

If this is taken for the practical form to be minimized it is apparent that the
second term does not enter into the minimization because it is invariant for
all unitary transformations. Hence only the first term $-2\sum_a R_a^2$ need be
minimized and this depends only on the matrix elements of the dipole moment
matrices. This maximization of $\sum_a R_a^2$ is used in our current programs.

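It can also be shown that the maximization of $J$, the sum of the squares of the
distances between the centroids of the orbitals, is equivalent to the
minimization of the last form of $I$. This can be written

$$\begin{aligned}
J &= \sum_{ab} \left[\langle\varphi_a|\mathbf{r}_1|\varphi_a\rangle - \langle\varphi_b|\mathbf{r}_1|\varphi_b\rangle\right]^2 \\
&= 2n\sum_a \langle\varphi_a|\mathbf{r}_1|\varphi_a\rangle^2 - 2\sum_a\sum_b \langle\varphi_a|\mathbf{r}_1|\varphi_a\rangle\cdot\langle\varphi_b|\mathbf{r}_1|\varphi_b\rangle \\
&= 2n\sum_a R_a^2 - 2\Big[\sum_a \langle\varphi_a|\mathbf{r}_1|\varphi_a\rangle\Big]^2.
\end{aligned} \tag{13}$$

The second term is invariant for unitary transformations of the orbitals so that
the maximization of $J$ is equivalent to the minimization of $I$, which was
equivalent to the other definitions of the exclusive orbitals.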
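The equivalence of these criteria can be illustrated numerically. The following is a minimal sketch, not the writer's program: it assumes the dipole moment matrices are available in an orthonormal orbital basis, and it uses Jacobi-type pairwise rotations (a standard technique substituted here for illustration) to maximize $\sum_a R_a^2$. All function names are hypothetical.

```python
import math

def centroids(dip):
    """Centroid R_a of each orbital: the diagonal elements of the dipole matrices."""
    n = len(dip[0])
    return [[m[a][a] for m in dip] for a in range(n)]

def sum_sq_centroids(dip):
    """Sum over orbitals of R_a^2, the quantity maximized in Eq. (12)."""
    return sum(x * x for r in centroids(dip) for x in r)

def sum_sq_distances(dip):
    """J of Eq. (13): squared centroid distances summed over all ordered pairs."""
    rs = centroids(dip)
    return sum(sum((x - y) ** 2 for x, y in zip(ra, rb)) for ra in rs for rb in rs)

def rotate_pair(dip, a, b, theta):
    """In place: phi_a' = c phi_a + s phi_b, phi_b' = -s phi_a + c phi_b."""
    c, s = math.cos(theta), math.sin(theta)
    for m in dip:
        n = len(m)
        for j in range(n):            # transform columns a and b
            ma, mb = m[j][a], m[j][b]
            m[j][a], m[j][b] = c * ma + s * mb, -s * ma + c * mb
        for j in range(n):            # transform rows a and b
            ma, mb = m[a][j], m[b][j]
            m[a][j], m[b][j] = c * ma + s * mb, -s * ma + c * mb

def localize(dip, max_sweeps=100, tol=1e-12):
    """Return rotated copies of the dipole matrices with sum of R_a^2 maximized."""
    dip = [[row[:] for row in m] for m in dip]
    n = len(dip[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for a in range(n):
            for b in range(a + 1, n):
                d = [(m[a][a] - m[b][b]) / 2.0 for m in dip]
                e = [m[a][b] for m in dip]
                num = 2.0 * sum(x * y for x, y in zip(d, e))
                den = sum(x * x for x in d) - sum(y * y for y in e)
                theta = 0.25 * math.atan2(num, den)   # optimal pair angle
                if abs(theta) > tol:
                    rotate_pair(dip, a, b, theta)
                    largest = max(largest, abs(theta))
        if largest <= tol:
            break
    return dip

# Two delocalized orbitals with a common centroid and a transition dipole of 1
# along x separate into two orbitals with centroids at +1 and -1.
result = localize([[[0.0, 1.0], [1.0, 0.0]]])
print(centroids(result), sum_sq_centroids(result), sum_sq_distances(result))
```

Because the trace of each dipole matrix is unchanged by the rotations, the last form of Eq. (13) shows that increasing $\sum_a R_a^2$ increases $J$ by exactly $2n$ times as much.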

It may be noted that if $R_{ab}$ denotes the distance between the centroids of
$\varphi_a$ and $\varphi_b$ then the above exclusive orbitals are given by the maximization
of $\sum_{a>b} R_{ab}^2$ while the original condition suggested by Foster and Boys was
the maximization of the corresponding product of the distances $R_{ab}$.
The writer does not regard this change as having a very great
effect. He is in favor of the sum form because one system involving an
activated complex was found for which the product criterion did not converge
in a reasonable physical way while the sum definition gave no difficulty. The
fact that the sum criterion has these equivalent definitions with simple physical
interpretations also appears to be in its favor.

It is apparent from the above analysis that the difference between the present
exclusive orbitals and the localized orbitals of Edmiston and Ruedenberg
can be expressed by saying that the former use an $r_{12}^2$ repulsion field where
the latter use the $r_{12}^{-1}$ Coulomb field. It might be claimed that the latter is more
natural to physical systems but the necessary calculation in this case involves
iterations with an $n^4$ matrix while the $r_{12}^2$ field only uses the $n^2$ dipole matrix.
It is the opinion of the writer that the exclusive orbitals are more convenient
for ordinary and easy use. For the further stage of localized adjustment
functions there are at present no alternatives to the oscillator orbitals.
It is probably these which will have the more far-reaching consequences for
the interpretation of physical phenomena as due to localized contributions
arising from the electron pairs in chemical bonds, lone pairs, and inner shells.
The definition of these could be appended to the orbitals calculated by the
method of Edmiston and Ruedenberg but it is probably more uniform to use
them with the exclusive orbitals, since both of them depend on relatively
simple transformations of the single-electron moment integrals.

There is nothing in the mathematics to prevent the application of these 
methods to crystals such as metals, or to molecules with large systems of 
conjugated $\pi$-orbitals, but in these cases the localized orbitals would not be
localized to the degree which is to be expected in systems whose bonds are 
all of the covalent type. For the former systems the best localized orbitals 
would penetrate each other to an extent which would make any analysis into 
properties of individual localized bonds and interactions between these only 
a formal mathematical process. In a covalent bond system it is to be expected, 
and has been found in several cases, that the penetrations and interactions do 
fall off quite rapidly as the combinations which are more removed from 
nearest neighbors are considered. This is, of course, to be expected from the 
general chemical evidence on how little bonds which are somewhat removed 
from nearest neighbors do influence each other. From the purely theoretical 
point of view the analysis can be applied to any system but if the results of 
this show that the orbitals do penetrate each other much more than for other 
covalent systems, and the interactions do not fall off sufficiently rapidly as 
orbitals a few removes from each other are considered, then the application 
to such a system had better be abandoned. It is not to be expected that this 
will be found in molecules or crystals which would normally be represented 
as having a structure of covalent bonds. 

It is not feasible here to set out the details of the corresponding analysis for 
a wave function which is expressed as a linear combination of a number of 
Slater determinants. But if these are expressed in terms of a scheme of
exclusive and oscillator orbitals it is possible to assign all the contributions to any
simple property, or phenomenon, to the individual exclusive orbitals. If a 
contribution is dependent on oscillator orbitals of one exclusive orbital then 
this is counted in the total contribution of that exclusive orbital. If a
contribution is dependent on two oscillator orbitals from different exclusive orbitals
then it is assigned half to one and half to the other. Such a scheme is artificial 
but it does provide a definite meaning to the contributions of particular 
bonds, or lone pairs, etc., to any particular phenomenon. This is very close 
to the general intuitive picture which would be used in the analysis of empirical 
data on properties of molecules or crystals. 

The construction of such a scheme is possible whenever the orbitals of a 
Slater determinant are first transformed to localized orbitals and then a system 
of expansion functions suitable for a general perturbation calculation is
constructed so that these are localized as closely as possible to the first localized
orbitals. 
I. Introduction

The investigation into the nature of localized atomic and molecular orbitals
which was begun in two earlier articles (1,2) is further pursued in the present
note by examining more complex systems. Two subjects are examined: (a) For
several polyatomic molecules the localized molecular orbitals are determined
and discussed and, in particular, the correlation between this theoretical
approach and empirical chemical intuition is examined. (b) For several
nontrivial systems, the relationship between localized orbitals and equivalent
orbitals is analyzed and it is found that the latter do not always exhibit
maximum localization.

Since the basic premises used are the same as those in the previous papers
(1,2), the introductory remarks made there apply also to the present
investigation. Localized molecular orbitals are denoted as LMO's.


II. Localized Molecular Orbitals in Simple Polyatomic Molecules 

In the following, localized molecular orbitals are discussed for the molecules
H₂O, NH₃, CH₄, and C₂H₆. In all cases, they are obtained from
minimal-basis-set LCAO-SCF wave functions based on Slater orbitals.* For water, the
molecular orbitals of Ellis (3) were localized, for ammonia those of Duncan
(4) were localized, and for methane a wave function of Pitzer (5) was
chosen. For ethane, the LMO's were taken which Pitzer had calculated (6) by
applying the authors' method to a wave function which he had obtained
earlier with Lipscomb (7).

The localized molecular orbitals of H₂O and NH₃ are given in Tables 1
and 2, respectively. The data in these tables are arranged exactly in the 
same manner as those in the tables of reference (2): The first matrix contains 
the exchange integrals for the canonical MO’s, the second contains those for 
the localized MO’s, the third matrix represents the transformation connecting 
the canonical and the localized MO’s, the fourth matrix contains the LCAO 
expansions of the localized MO’s in terms of the orthogonalized Slater AO's. 
The localized molecular orbitals of ethane and methane are given in Table 3. 
In Table 4, there are given the normalized atomic components of the localized 
molecular orbitals on various atoms in several molecules in a way which 
permits comparison of related atomic hybrids in different molecules. 

A. H₂O (Table 1)

The LMO's of water are seen to consist of an inner shell, two equivalent
oxygen lone pair orbitals, and two equivalent bond orbitals extending about
equally over the O and H atoms of the bond. The normalized atomic
component to the inner shell, given in Table 4, is quite similar to that of the
oxygen atom in CO, and also that of the free O atom, which were found in (2).

* That is, Slater-type orbitals (STAO's) with Slater's exponent values.

A question of interest is the angle between the two equivalent atomic s-p
hybrids, which the oxygen atom contributes to the equivalent bonding
LMO's. It is found to be about 90°, not too close to the bond angle (104.5°),
but not unreasonable. Nevertheless, the normalized hybrids contain
considerable s-character (about one-half as much as an sp³ hybrid). Consequently,
they are nonorthogonal, with an overlap integral of 0.134 (equivalent
orthogonal s-p hybrids enclosing a 90° angle would be pure p orbitals).
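The relation between s-character, interhybrid angle, and overlap that underlies this remark can be checked with a few lines of code. The sketch below assumes two equivalent normalized hybrids $h_i = \sqrt{f}\,s + \sqrt{1-f}\,p_i$ built from orthonormal orbitals on one atom, so that $\langle h_1|h_2\rangle = f + (1-f)\cos\theta$; the function names are illustrative and not taken from the source.

```python
import math

def hybrid_overlap(s_fraction, angle_deg):
    """Overlap of two equivalent normalized s-p hybrids on the same atom.

    Assumes h_i = sqrt(f) s + sqrt(1 - f) p_i with unit p directions that
    enclose the angle theta, which gives <h_1|h_2> = f + (1 - f) cos(theta).
    """
    theta = math.radians(angle_deg)
    return s_fraction + (1.0 - s_fraction) * math.cos(theta)

def orthogonal_angle_deg(s_fraction):
    """Angle at which two equivalent hybrids with this s-fraction are orthogonal."""
    return math.degrees(math.acos(-s_fraction / (1.0 - s_fraction)))

# At 90 degrees the overlap is the s-fraction itself, so an overlap of 0.134
# corresponds to about half the s-fraction (0.25) of an sp3 hybrid.
print(hybrid_overlap(0.134, 90.0))
# sp3 hybrids are orthogonal at the tetrahedral angle.
print(orthogonal_angle_deg(0.25))
```

Under this assumption, orthogonal hybrids at 90° require a vanishing s-fraction, which is the statement made in parentheses above.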

On the other hand, if equivalent MO's are constructed in such a way that
the lone pair LMO's contain no contributions from the H atoms, then the
oxygen bond-hybrids form an angle of 74.9° between them. Finally, an angle
of 73.6° is found between the two bonding hybrids on oxygen, if they are
obtained by an equivalent orbital transformation between the two MO's
[$(3a_1)$ and $(1b_2)$] which have the highest energies among those being symmetric
in the plane of the molecule. Thus, the LMO's obtained by the present method
yield the most reasonable hybrids.

Returning to the LMO's, we find the angle between the oxygen components
of the lone pair LMO's to be 124°. They are directed above and below the
plane of the molecule. The overlap between the normalized oxygen component
to one of the lone pair LMO's and the normalized oxygen component to one
of the bonding LMO's is −0.010. The mutual overlap between the two
normalized oxygen lone pair components is 0.071. Thus, the long-held assumption
that atomic hybrid orbitals are very nearly orthogonal is approximately,
but not quantitatively, correct.

B. NH₃ (Table 2)

The LMO's for NH₃ consist of an inner shell, a lone pair on the N atom,
and three equivalent bond orbitals. The bond orbitals found in NH₃ are
slightly less equivalent than the equivalent LMO's found in the other
molecules. This is probably due to a small error in the input integrals which could
not be located.*

According to Table 4, the N component to the lone pair contains much
less s-character than the lone pair in N₂ which was given in Table 15 of ref.
(2). This appears to be the only great deviation thus far found for the forms
of two chemically similar LMO's.

The inner shell of NH₃ has a larger than usual contribution from the
Slater 1s orbital. This also occurs in NH and C₂H₆ (see below), but not in
H₂O or HF.

The angle between the equivalent nitrogen hybrids which form the N 
contributions to the bond LMO's is 104.5° (the HNH bond angle is close to 
107°). Correspondingly, the angle between one of the bonding hybrids and 
the lone pair hybrid is 114°; the corresponding overlap is -0.02S. The mutual 
overlap of two of the bonding hybrids is 0.004. 

Using the same wave function, Duncan (4) constructed localized MO's
assuming that the H orbitals do not contribute to the lone pair. This leads to
the less reasonable angle of 93.7° between two of the equivalent bond hybrids.
If one applies the equivalent orbital transformation directly to the three
canonical bonding MO's (in Duncan's paper these are $\Phi_5$, $\Phi_7$, and a third
orbital whose index is illegible in the source), one obtains the
even more unlikely angle of 165° between two equivalent bond hybrids.

* The deviations from complete equivalence in the other molecules merely reflect the fact
that 8-figure accuracy in the self-energy allows only about 4-figure accuracy in the LMO's.

[Table 2. Localized molecular orbitals in NH₃. The numerical entries are not recoverable from the source. Surviving table note: "... and $\Phi_7$ are Duncan's notation for the canonical SCF MO's."]

C. C₂H₆ and CH₄ (Table 3)

The three types of equivalent LMO's for C₂H₆ are given in Table 3 in
terms of orthogonalized Slater orbitals (Pitzer gives them in terms of a
nonorthogonal 2s). H₁, H₂, and H₃ are bonded to the C atom, and H₇, H₈, and
H₉ to the C'. The staggered configuration of C₂H₆ is considered here, and
H₇ lies trans to H₁. The carbon 2pz orbitals are along the C-C' bond axis
pointed toward each other. The carbon 2px orbitals lie in the H₁-C-C'-H₇
plane. As one would anticipate, the LMO's turn out to be six equivalent CH
bond orbitals, a CC bond orbital, and an inner shell on each of the C atoms.

For CH₄ the exact LMO's were not obtained. However, since it is certain
that the valence shell LMO's are four equivalent CH bond orbitals, and since
the inner shell can be expected to be quite similar to those in C₂H₆, we were
able to write down a good approximation for the LMO-CMO transformation
matrix. They are listed in Table 3 in terms of a basis analogous to that in
C₂H₆, for comparison purposes. The C2pz is directed toward H₂, which is
taken to occupy a position analogous to that of the C' atom in C₂H₆. The
C2px lies in the H₂-C-H₁ plane, with its positive lobe close to H₁. The
inner shell LMO, (iC), is a linear combination of the $(1a_1)$ and $(2a_1)$ CMO's,
taken to be such that the normalized component on the C atom is essentially
identical to the corresponding ones in C₂H₆. Thereby a new $(2'a_1)$ MO (a linear
combination of the original $1a_1$ and $2a_1$) is determined which is orthogonal
to the inner shell LMO. This is then hybridized tetrahedrally with the $1t_x$,
$1t_y$, and $1t_z$ CMO's, to give the bonding LMO's bCH₁, bCH₂, bCH₃, and
bCH₄. The latter two are not listed, because they furnish no new information
or comparisons.

The inner shell LMO's of CH₄ and C₂H₆ are seen to have quite similar
contributions from the hydrogens, which implies that the procedure used for
estimating the CH₄ (iC) LMO is probably quite reliable. In any case, the
forms of the valence shell LMO's are not greatly sensitive to the (iC) LMO
(even though the exchange effects between the shells are very sensitive), and
these are what we are primarily interested in comparing.

A comparison of the bCH₁ bond orbitals in the two molecules is very
interesting. First, the ratio of the coefficients of the 2pz and 2px orbitals is
slightly larger in C₂H₆ than it is in CH₄. This indicates that in C₂H₆ the
bCH₁ orbital is not directed exactly along the CH₁ bond axis, but the pyramid
formed by the three (bCH$_k$) bond LMO's lies slightly inside the pyramid
formed by the three CH$_k$ bonds. The angle between two of the C atomic
hybrids contained in the (bCH) bond LMO's is 109.0° in C₂H₆, whereas it is
of course 109.5° in CH₄.

Secondly, one finds that the (bCH₁) bond LMO gives a larger contribution
to the net atomic population (8) of the C atom than to that of the H atom,
which is in agreement with the general feeling that the carbon atom is more
electronegative than the hydrogen atom. The net atomic contributions are
0.543 for H and 0.754 for C in CH₄, and 0.581 for H and 0.721 for C in C₂H₆.

The largest coefficient of the (H1s) orbitals in the bond LMO's of C₂H₆
differs only by about 3% from that in the CH₄ bond orbitals. This is not
surprising, since it is generally assumed that the CH bond orbitals are virtually
identical in all saturated hydrocarbons, and this assumption is the basis of
many semiempirical theories. One would have guessed that the hydrogens are
slightly more positive in ethane than in methane because, in the former,
there are fewer hydrogen atoms available to donate charge to the more
electronegative carbon atom. The present results would indicate, however, a
weak effect in the opposite direction, i.e. the hydrogens in CH₄ are slightly more
positive than those in C₂H₆. This can also be seen from a simple population
analysis using Mulliken's definition of net atomic populations (8). One finds
hydrogen populations of 0.628 and 0.592 in C₂H₆ and CH₄, respectively.
These atomic populations contain of course contributions from all LMO's.

Since the same type of basis functions are used in both molecules (Slater
orbitals, i.e. Slater-type orbitals with Slater exponent values), it seems
unlikely that the decrease of the hydrogen positivity in going from CH₄ to
C₂H₆ should be a peculiarity of the approximation and absent in the exact
SCF solution. Nevertheless, it would seem to be of interest to know whether
the effect remains true for the exact SCF wave function.

III. Localized Orbitals and Equivalent Orbitals 

Historically, Hund (9), Coulson (10), and Lennard-Jones (11), both alone
and with Hall (12), first discussed the idea of equivalent orbitals to establish
correspondence to chemical intuition. Equivalent orbitals are defined by the
property that any symmetry operation of the molecular symmetry group
merely permutes the equivalent orbitals. Later Lennard-Jones and Pople (13),
realizing that this concept was of little help within one symmetry species,
proposed the self-energy criterion for localization within one species. Since
then it seems to have been frequently assumed that equivalent orbitals always
satisfy this criterion (14). The following examples will show that there are
conditions under which this assumption is not justified.

A. Nonequivalent localized orbitals in the case of s-p hybridization 

It is generally imagined that the two digonal hybrids

$$h_+ = (s + p)/\sqrt{2}, \qquad h_- = (s - p)/\sqrt{2}$$

are more localized than the original orbitals, s, p. However, such is the case
only if there exists a sufficiently large local overlap between the s-orbital and
the p-orbital because, then, hybridization can be effective in decreasing the
local overlap. If, on the contrary, the s-orbital and the p-orbital have very
small local overlap, then hybridization would tend to increase it (by making
the orbitals comparable in size) and thereby decrease the localization. Thus
it is to be expected that the s-orbital and the p-orbital themselves represent the
LMO's in the case that one is very much more contracted than the other.*
For a quantitative investigation, we fall back on Eqs. (15)ff of our first
communication (1). Choosing $\varphi_1 = s$ and $\varphi_2 = p$ in those equations, we find
that transformation from $(s, p)$ to $(h_+, h_-)$ changes the localization sum by

$$\Delta D = D(h_+, h_-) - D(s, p) = 2[sp|sp] - \tfrac{1}{2}[s^2 - p^2|s^2 - p^2].$$

The LMO's are identical with $(h_+, h_-)$ if $\Delta D > 0$, and with $(s, p)$ if $\Delta D < 0$.
In the following, two cases are considered: {s = 1s, p = 2p} and
{s = 2s (Slater), p = 2p}. Let $\zeta_s$ and $\zeta_p$ be the orbital exponents of the two
functions, and

$$\zeta = \tfrac{1}{2}(\zeta_s + \zeta_p),$$

$$\tau = (\zeta_s - \zeta_p)/(\zeta_s + \zeta_p) = [1 - (\zeta_p/\zeta_s)]/[1 + (\zeta_p/\zeta_s)].$$

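Then one obtains

$$2[sp|sp] = \zeta X(\tau), \qquad \tfrac{1}{2}[s^2 - p^2|s^2 - p^2] = \zeta C(\tau),$$

where, for s = 1s, p = 2p,

$$16X(\tau) = \tfrac{7}{3}(1 + \tau)^3(1 - \tau)^5,$$

$$16C(\tau) = 5(1 + \tau) - \tfrac{1}{2}(1 - \tau^2)(14 - 7\tau - \tau^2 + 3\tau^3 - \tau^4) + \tfrac{501}{160}(1 - \tau);$$

and, for s = 2s, p = 2p,

$$128X(\tau) = \tfrac{185}{9}(1 - \tau^2)^5,$$

$$128C(\tau) = \tfrac{93}{4}(1 + \tau) - \tfrac{1}{2}(1 - \tau^2)(93 - 47\tau^2 + 23\tau^4 - 5\tau^6) + \tfrac{501}{20}(1 - \tau).$$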

* This section was stimulated by A. C. Hurley's remark that, according to unpublished
observations in C. A. Coulson's laboratory, s-p hybridization decreases localization when
the 2s functions are exceedingly contracted.
Figure 1 gives $\Delta D/\zeta$ as a function of $\tau$ for the two cases and shows that the
LMO's are digonal hybrids $h_\pm$ if and only if

$-0.601 < \tau < 0.108$, i.e. $0.249 < \zeta(1s)/\zeta(2p) < 1.24$ in the 1s-2p case,

$-0.352 < \tau < 0.367$, i.e. $0.479 < \zeta(2s)/\zeta(2p) < 2.16$ in the 2s-2p case.

Fig. 1. Localization sum changes (lines) and approximation by a linear function of the
local overlap (points). Full line and circular points: 2s-2p hybridization. Dashed line and
triangular points: 1s-2p hybridization. (Energy in atomic units.)
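The quoted limits can be reproduced by locating the zeros of $\Delta D/\zeta$. The sketch below is an illustration under stated assumptions: it takes $\zeta = \tfrac{1}{2}(\zeta_s + \zeta_p)$ and uses polynomial coefficients (7/3, 501/160, 185/9, 93/4, 501/20 and the $3\tau^3$ term) reconstructed from the standard Coulomb and exchange integrals over Slater-type orbitals, since several of them are illegible in the scan. Function names are hypothetical.

```python
def delta_d_1s2p(tau):
    """Delta D / zeta for s = 1s, p = 2p (reconstructed coefficients)."""
    x16 = (7.0 / 3.0) * (1 + tau) ** 3 * (1 - tau) ** 5
    c16 = (5 * (1 + tau)
           - 0.5 * (1 - tau ** 2) * (14 - 7 * tau - tau ** 2 + 3 * tau ** 3 - tau ** 4)
           + (501.0 / 160.0) * (1 - tau))
    return (x16 - c16) / 16.0

def delta_d_2s2p(tau):
    """Delta D / zeta for s = 2s (Slater), p = 2p (reconstructed coefficients)."""
    x128 = (185.0 / 9.0) * (1 - tau ** 2) ** 5
    c128 = ((93.0 / 4.0) * (1 + tau)
            - 0.5 * (1 - tau ** 2) * (93 - 47 * tau ** 2 + 23 * tau ** 4 - 5 * tau ** 6)
            + (501.0 / 20.0) * (1 - tau))
    return (x128 - c128) / 128.0

def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi]; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponent_ratio(tau):
    """zeta_s / zeta_p corresponding to a given tau."""
    return (1 + tau) / (1 - tau)

for name, f, brackets in [
    ("1s-2p", delta_d_1s2p, [(-0.9, -0.3), (-0.3, 0.5)]),
    ("2s-2p", delta_d_2s2p, [(-0.9, 0.0), (0.0, 0.9)]),
]:
    roots = [bisect(f, lo, hi) for lo, hi in brackets]
    print(name, [round(r, 3) for r in roots],
          [round(exponent_ratio(r), 3) for r in roots])
```

Between the two roots of each case $\Delta D$ is positive and the digonal hybrids are the more localized pair; outside that interval the unhybridized s and p orbitals are.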

The relation to local overlap can be formulated quantitatively by comparing
the localization sum changes $(\Delta D/\zeta)$ with the overlap integrals between the
s-orbitals and the absolute value of the p-orbital, i.e.,

$$S(1s, |2p|) = \int dV\,(1s)\,|2p|,$$

$$S(2s, |2p|) = \int dV\,(2s)\,|2p|.$$

One finds that, in atomic units, one has approximately

$$(\Delta D/\zeta) \approx 0.6211\,S - 0.3914,$$

for the 1s case as well as for the 2s case. The close correspondence is exhibited
in Fig. 1 by inserting selected values of $(0.6211\,S - 0.3914)$ as circular points
for $S(2s, |2p|)$ and as triangular points for $S(1s, |2p|)$. In all cases, the digonal
hybrids represent the localized orbitals whenever, approximately, $S > 0.63$.
Since it is unlikely that $\zeta(2s)$ and $\zeta(2p)$ differ by as much as a factor 2, it
can be expected that digonal hybrids usually represent localized orbitals
in the valence shell. In contrast, there is probably no case where $\zeta(1s)$ is as
small as twice $\zeta(2p)$, and hence localization never leads to 1s-2p digonal
hybrids. This latter circumstance can be considered as fortunate because it
presumably has the consequence that the inner shell LMO of an atom within
a molecule will remain relatively unaffected by directional variations of the
bonding LMO's, i.e. the inner shell will be pretty much the same in different
molecules, even though 2s-2p hybridization may change. This is
indeed found to be the case for the systems investigated in this and a
previous paper (2).

B. Nonequivalent localized orbitals in C₂ (Table 5)

In the case of the $1\sigma_g^2\,1\sigma_u^2\,2\sigma_g^2\,2\sigma_u^2\,\pi_u^2\,\bar{\pi}_u^2$, $^1\Sigma_g^+$ ground state of the C₂ molecule,
it is difficult to guess what type of equivalent orbitals should exhibit maximum
localization. It was therefore considered of interest to analyze the C₂
calculation of Ransil (15) from this point of view. Application of the localization
technique to Ransil's calculation yields the results given in Table 5, which is
arranged in the same manner as Tables 1 and 2. It is readily seen that the
LMO's of C₂ are not equivalent orbitals of any type.

The localization sum $D$ is found to be 9.5188 ± 0.0001 H for the LMO's.
Except for the inner shell orbitals, it was not possible to determine the LCAO
expansions of the LMO's to more than about two figures. When the starting
orbitals were changed, the localization procedure resulted in somewhat
different orbitals, indicating that, here, one can have variations of the
mentioned magnitude in the orbitals without changing $D$ by more than $10^{-8}$ H,
the limit in sensitivity of the program used. For each of the sets of localized
orbitals found, $D$ was found to have the same value.

In order to establish whether any set of equivalent orbitals would be as
localized as the LMO's, we examined $D$ for various possible types of
equivalent orbitals. In all these cases the valence orbitals alone were treated and
the inner shells taken to be those given by the localization procedure for the
LMO's mentioned above. The localization of the inner shells alone results in
the configuration $(iC)^2(iC')^2(2\bar{\sigma}_g)^2(2\bar{\sigma}_u)^2(\pi_u)^2(\bar{\pi}_u)^2$, where the bars on $2\bar{\sigma}_g$
and $2\bar{\sigma}_u$ indicate that these orbitals have been changed somewhat by the
localization of the inner shells. The localization of the inner shells changes $D$
from 6.0315 H, for the canonical SCF MO's, to 9.3092 H for the LMO's.

One possible set of equivalent orbitals can now be obtained by left-right
combination of $(2\bar{\sigma}_g)$ and $(2\bar{\sigma}_u)$. The electronic configuration can then be
written as $(iC)^2(iC')^2\sigma_+^2\sigma_-^2\pi_u^2\bar{\pi}_u^2$, where

$$\sigma_+ = (2\bar{\sigma}_g + 2\bar{\sigma}_u)/\sqrt{2}, \qquad \sigma_- = (2\bar{\sigma}_g - 2\bar{\sigma}_u)/\sqrt{2}.$$

The localization sum $D$ is found to increase to 9.3397 H.

[Table 5 note: The symbols v1CC', v2CC', v3CC', v4CC' denote the four localized valence molecular orbitals.]

Another possible set of equivalent orbitals is obtained by trigonal
hybridization between the $\pi$-MO's and a $\sigma$-MO

$$\sigma_\alpha = \cos\alpha\,(2\bar{\sigma}_g) + \sin\alpha\,(2\bar{\sigma}_u),$$

where the angle $\alpha$ remains to be determined. The electronic configuration
can be written $(iC)^2(iC')^2 t_1^2 t_2^2 t_3^2 \sigma_\alpha'^2$, where

$$\sigma'_\alpha = -\sin\alpha\,(2\bar{\sigma}_g) + \cos\alpha\,(2\bar{\sigma}_u)$$

is orthogonal to $\sigma_\alpha$ and

$$t_k = (\sigma_\alpha + \sqrt{2}\,\pi_k)/\sqrt{3}$$

are polarized banana bond orbitals with

$$\pi_1 = \pi_u, \qquad \pi_2 = \tfrac{1}{2}(-\pi_u + \sqrt{3}\,\bar{\pi}_u), \qquad \pi_3 = \tfrac{1}{2}(-\pi_u - \sqrt{3}\,\bar{\pi}_u).$$

The localization sum

$$D = 3[t_k^2|t_k^2] + [\sigma_\alpha'^2|\sigma_\alpha'^2] + 2[iC\,iC|iC\,iC]$$

becomes

$$D = A\cos 4\alpha + B\sin 4\alpha + C\cos 2\alpha + F\sin 2\alpha + G$$

with

$$A = \tfrac{1}{6}[gg - uu|gg - uu] - \tfrac{2}{3}[gu|gu],$$

$$B = \tfrac{2}{3}[gg - uu|gu],$$

$$C = -\tfrac{1}{3}[gg - uu|gg + uu - 2\pi\pi] + \tfrac{4}{3}[g\pi|g\pi] - \tfrac{4}{3}[u\pi|u\pi],$$

$$F = -\tfrac{2}{3}[gu|gg + uu - 2\pi\pi] + \tfrac{8}{3}[g\pi|u\pi],$$

$$G = \tfrac{1}{2}[gg - uu|gg - uu] + \tfrac{2}{3}\{2[gg|uu] + [gg|\pi\pi] + [uu|\pi\pi] + [gu|gu] + 2[g\pi|g\pi] + 2[u\pi|u\pi] + 2[\pi\pi|\pi\pi]\} + 2[iC\,iC|iC\,iC],$$

where $g$, $u$, $\pi$ denote $(2\bar{\sigma}_g)$, $(2\bar{\sigma}_u)$, $\pi_u$, respectively. By virtue of the $g$-$u$
symmetry, one finds $B = F = 0$, and further substitution of the numerical
values

$[gg|gg] = 0.61377$, $[gg|uu] = 0.44816$, $[gu|gu] = 0.06372$,

$[uu|uu] = 0.47636$, $[gg|\pi\pi] = 0.51835$, $[g\pi|g\pi] = 0.09328$,

$[\pi\pi|\pi\pi] = 0.49608$, $[uu|\pi\pi] = 0.43980$, $[u\pi|u\pi] = 0.04525$,

$[iC\,iC|iC\,iC] = 3.61346$

yields

$$D = -0.01018\cos 4\alpha + 0.07060\cos 2\alpha + 9.44876.$$

The $\cos 2\alpha$ term dominates the variation with $\alpha$ and maximal localization
occurs for $\alpha = 0$. The localized configuration is therefore $(iC)^2(iC')^2 t_1^2 t_2^2 t_3^2 (2\bar{\sigma}_u)^2$
where $t_k$ are unpolarized banana bonds, the value of $D$ being 9.5092 H. It is
of interest that the minimum value within trigonal hybridization is 9.3680 H
and, thus, still higher than that for the $(\sigma_+, \sigma_-)$ hybridization discussed
earlier. It occurs for $\alpha = \tfrac{1}{2}\pi$, i.e. for the configuration $(iC)^2(iC')^2(t'_1)^2(t'_2)^2(t'_3)^2(2\bar{\sigma}_g)^2$,
the $t'_k$ being $\pi$-$(2\bar{\sigma}_u)$ hybrids.
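The numerical coefficients of $D(\alpha)$ can be recomputed from the integrals quoted in the text. The sketch below assumes the expressions for $A$, $C$, and $G$ obtained by expanding $3[t_k^2|t_k^2] + [\sigma_\alpha'^2|\sigma_\alpha'^2] + 2[iC\,iC|iC\,iC]$ with the terms that vanish by $g$-$u$ symmetry dropped; the variable and function names are illustrative only.

```python
import math

# Integrals quoted in the text (hartree); g = 2sigma_g, u = 2sigma_u, p = pi_u
GGGG, GGUU, GUGU = 0.61377, 0.44816, 0.06372
UUUU, GGPP, GPGP = 0.47636, 0.51835, 0.09328
PPPP, UUPP, UPUP = 0.49608, 0.43980, 0.04525
INNER = 3.61346   # self-repulsion of one inner shell orbital

def coefficients():
    """Coefficients A, C, G of D = A cos 4a + C cos 2a + G (B = F = 0)."""
    diff_sq = GGGG + UUUU - 2 * GGUU              # [gg - uu | gg - uu]
    a = diff_sq / 6 - 2 * GUGU / 3
    c = (-(GGGG - UUUU - 2 * GGPP + 2 * UUPP) / 3
         + 4 * (GPGP - UPUP) / 3)
    g = (diff_sq / 2
         + (2 / 3) * (2 * GGUU + GGPP + UUPP + GUGU
                      + 2 * GPGP + 2 * UPUP + 2 * PPPP)
         + 2 * INNER)
    return a, c, g

def localization_sum(alpha):
    """Localization sum D (hartree) for the trigonal hybrid angle alpha."""
    a, c, g = coefficients()
    return a * math.cos(4 * alpha) + c * math.cos(2 * alpha) + g

print([round(v, 5) for v in coefficients()])
print(round(localization_sum(0.0), 4), round(localization_sum(math.pi / 2), 4))
```

Evaluating the sum at $\alpha = 0$ and $\alpha = \tfrac{1}{2}\pi$ gives the maximum and minimum within trigonal hybridization that are discussed above.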

Thus, the localization of the orbitals in the $(iC)^2(iC')^2 t_1^2 t_2^2 t_3^2 (2\bar{\sigma}_u)^2$
configuration is almost as strong as that of the true LMO's (i.e. 9.5188 H), but
they are definitely not identical with the LMO's. One can imagine other
possible hybridizations to equivalent orbitals, such as that of the $2\bar{\sigma}_g$ orbital
with only one of the $\pi_u$ orbitals, but these are surely less localized. For the
purposes of semiempirical or pair theories, there might be an advantage to
work with a set of equivalent orbitals; where localization is important,
however, one would want to choose that set of equivalent orbitals which is
most localized.

C. Nonequivalent localized orbitals in benzene 

In the ground state of benzene, the six pi-electrons occupy the delocalized
canonical MO's

$$(a) = N_a\{\chi_1 + \chi_2 + \chi_3 + \chi_4 + \chi_5 + \chi_6\},$$

$$(ex) = N_x\{\chi_1 - \chi_3 - \chi_4 + \chi_6\},$$

$$(ey) = N_y\{\chi_1 + 2\chi_2 + \chi_3 - \chi_4 - 2\chi_5 - \chi_6\},$$

where $\chi_1 \cdots \chi_6$ are the atomic p-orbitals and $N_k$ normalization constants.
Within the molecular symmetry group $D_{6h}$, the orbital (a) transforms
according to the representation $A_{2u}$, and the orbital pair (ex), (ey) according to
$E_{1g}$.
Hall and Lennard-Jones (16) pointed out that it is possible to construct three
equivalent orbitals $k_1$, $k_2$, $k_3$ which can be formulated as follows

$$k_1 = (\tfrac{1}{3})^{1/2}(a) + (\tfrac{2}{3})^{1/2}(ex),$$

$$k_2 = C_3 k_1, \qquad k_3 = C_3^{-1} k_1,$$

where the operator $C_3$ represents a rotation by 120°. These orbitals correspond
to the classical Kekulé structures inasmuch as $k_1$ contains strong and identical
contributions from the two adjacent atoms 1 and 6, and lesser contributions
from the other atoms.

Later Allen and Shull (17) remarked that another set of equivalent orbitals,
$l_1, l_2, l_3$, can be formulated, which we can express as

$$l_1 = (\tfrac{1}{3})^{1/2}(a) + (\tfrac{2}{3})^{1/2}(ey),$$

$$l_2 = C_3 l_1, \qquad l_3 = C_3^{-1} l_1.$$

Here, the orbital $l_1$ has a major contribution on the one atom 2, somewhat
smaller identical contributions from atoms 1 and 3, and lesser contributions
from the remaining atoms. The question arises whether this set is less or more
localized than that proposed by Hall.

In trying to apply the localization procedure, we noted that these two sets 
are actually equally localized. In the following, we shall show that, in fact, 
they represent only two special cases of an infinity of sets of equally localized 
equivalent orbitals (18). 

The most general set of equivalent orbitals which can be formed here is

$$\varphi_1 = (\tfrac{1}{3})^{1/2}(a) + (\tfrac{2}{3})^{1/2}[\cos\alpha\,(ex) + \sin\alpha\,(ey)],$$

$$\varphi_2 = C_3\varphi_1, \qquad \varphi_3 = C_3^{-1}\varphi_1,$$

where $\alpha$ is an arbitrary angle. It is readily verified that these
three orbitals are mutually orthogonal. They are equivalent orbitals in the
group $C_6$ but, if $\alpha \neq 0, \tfrac{1}{2}\pi$, they are not equivalent orbitals in the group
$D_{6h}$. It can now be proved that the localization sum

$$D(\varphi) = 3[\varphi_1^2|\varphi_1^2]$$

is independent of $\alpha$, which means that all possible sets of this type are, in
fact, equally localized.
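The orthogonality claim can be checked numerically. The following is only a sketch: it represents each MO by its coefficients over the six p-orbitals and assumes an orthonormal atomic basis (overlap neglected), which is an assumption of this illustration and not part of the text. The helper names are illustrative.

```python
import math

# Coefficient vectors over chi_1 ... chi_6, normalized under the ASSUMPTION
# of an orthonormal atomic basis (overlap neglected).
A = [1 / math.sqrt(6)] * 6
EX = [c / 2 for c in (1, 0, -1, -1, 0, 1)]
EY = [c / math.sqrt(12) for c in (1, 2, 1, -1, -2, -1)]


def phi1(alpha):
    """(1/3)^(1/2) (a) + (2/3)^(1/2) [cos(alpha) (ex) + sin(alpha) (ey)]."""
    return [
        math.sqrt(1 / 3) * a
        + math.sqrt(2 / 3) * (math.cos(alpha) * x + math.sin(alpha) * y)
        for a, x, y in zip(A, EX, EY)
    ]


def rotate(vec, steps):
    """Shift the coefficients cyclically by `steps` atoms (C3 is steps = 2)."""
    n = len(vec)
    return [vec[(i - steps) % n] for i in range(n)]


def equivalent_set(alpha):
    """Return [phi_1, C3 phi_1, C3^-1 phi_1] for the given angle."""
    p1 = phi1(alpha)
    return [p1, rotate(p1, 2), rotate(p1, -2)]


def gram(orbitals):
    """Matrix of scalar products between the orbitals."""
    return [[sum(u * v for u, v in zip(p, q)) for q in orbitals] for p in orbitals]


if __name__ == "__main__":
    for alpha in (0.0, 0.3, math.pi / 2):
        print(alpha, [[round(x, 12) for x in row] for row in gram(equivalent_set(alpha))])
```

For alpha = 0 the largest coefficients of the first orbital sit on atoms 1 and 6, and for alpha = pi/2 on atom 2, in line with the two special sets described in the text.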

The proof rests on the following transformation properties of orbital
products in the group $D_{6h}$:

$P_0 = (a)^2$   transforms according to $A_{1g}$,
$P_1 = [(ex)^2 + (ey)^2]$   transforms according to $A_{1g}$,
$P_{2x} = (a)(ex)$, $P_{2y} = (a)(ey)$   transform according to $E_{1u}$,
$P_{3x} = (ex)^2 - (ey)^2$, $P_{3y} = 2(ex)(ey)$   transform according to $E_{2g}$.

Consequently the following identities hold between electron interaction
integrals:

$$[P_0|P_{kq}] = [P_1|P_{kq}] = 0, \qquad [P_{2q}|P_{3q'}] = 0 \qquad (k = 2, 3;\ q = x, y;\ q' = x, y).$$

Using these identities and others resulting by combination, for example,

$$[(ex)^2|(ex)^2] - [(ex)^2|(ey)^2] - 2[(ex)(ey)|(ex)(ey)] = 0,$$

one obtains

$$3D = [a^2|a^2] + 4(\cos^4\alpha + \sin^4\alpha)[(ex)^2|(ex)^2] + 8\cos^2\alpha\sin^2\alpha\,[(ex)^2|(ey)^2]$$
$$\qquad + 16\cos^2\alpha\sin^2\alpha\,[(ex)(ey)|(ex)(ey)] + 8(\cos^2\alpha + \sin^2\alpha)[a(ex)|a(ex)]$$
$$\qquad + 4(\cos^2\alpha + \sin^2\alpha)[a^2|(ex)^2],$$

$$3D = [a^2|a^2] + 4(\cos^2\alpha + \sin^2\alpha)^2[(ex)^2|(ex)^2]$$
$$\qquad + 8\cos^2\alpha\sin^2\alpha\,\{-[(ex)^2|(ex)^2] + [(ex)^2|(ey)^2] + 2[(ex)(ey)|(ex)(ey)]\}$$
$$\qquad + 8[a(ex)|a(ex)] + 4[a^2|(ex)^2],$$

$$3D = [a^2|a^2] + 4[(ex)^2|(ex)^2] + 8[a(ex)|a(ex)] + 4[a^2|(ex)^2],$$

which is indeed independent of $\alpha$.
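The cancellation of the angle dependence can be illustrated with a few lines of arithmetic. This is a sketch only: the integral values below are arbitrary placeholders chosen for illustration (not benzene integrals), constrained solely by the identity [(ex)^2|(ex)^2] - [(ex)^2|(ey)^2] - 2[(ex)(ey)|(ex)(ey)] = 0 quoted above.

```python
import math


def three_d(alpha, aa_aa, xx_xx, xx_yy, xy_xy, ax_ax, aa_xx):
    """First expression for 3D in the text, before any identity is used.

    Argument names are illustrative: xx_yy stands for [(ex)^2|(ey)^2],
    xy_xy for [(ex)(ey)|(ex)(ey)], ax_ax for [a(ex)|a(ex)], and so on.
    """
    c2 = math.cos(alpha) ** 2
    s2 = math.sin(alpha) ** 2
    return (
        aa_aa
        + 4 * (c2 * c2 + s2 * s2) * xx_xx
        + 8 * c2 * s2 * xx_yy
        + 16 * c2 * s2 * xy_xy
        + 8 * (c2 + s2) * ax_ax
        + 4 * (c2 + s2) * aa_xx
    )


# Placeholder values (hypothetical); xx_yy is fixed by the identity.
INTEGRALS = {"aa_aa": 0.9, "xx_xx": 0.6, "xy_xy": 0.05, "ax_ax": 0.1, "aa_xx": 0.7}
INTEGRALS["xx_yy"] = INTEGRALS["xx_xx"] - 2 * INTEGRALS["xy_xy"]

if __name__ == "__main__":
    for alpha in (0.0, 0.4, math.pi / 2):
        print(alpha, three_d(alpha, **INTEGRALS))
```

Any other set of values obeying the same identity gives the same angle-independent result, which is the content of the last line of the derivation.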

It may be noted that the lone pairs of the F$_2$ molecule, mentioned in a
previous paper (2), represent another example of equivalent sets of equivalent
orbitals.

Abstract. This work presents an interpolation method for obtaining potential surfaces
of multicenter systems using information on the united-atom, separated-atom, and
atomic-association limits. The interpolation formula given by Eq. (13) is a rational function of
the nuclear coordinates. Its main advantage is that it leads to a set of linear equations
for the determination of the surface's parameters, so that the computational work involved is
fairly simple. Its usefulness will depend on the future requirements of the theories of
molecular spectroscopy and molecular reactions.

A new ansatz for the energy hypersurfaces of polyatomic systems is presented
and discussed. In principle it can be used for any number of atoms, and it
allows all the information known about an energy hypersurface to be taken
into account.

The method is illustrated for the system He, H, H and for the molecules H$_2$
and H$_3$. It is shown that even very simple forms of the ansatz give the
energy functions of He, H, H and H$_3$ to a good approximation. With an improved
form, the potential curve of H$_2$ can be calculated with an accuracy not
available before.

I. Introduction

Hardly any other concept in theoretical molecular spectroscopy is as
fundamental as that of the energy hypersurface or energy curve. It rests on
the approximation that the motions of the atoms are much slower than the
motions of the electrons (Born-Oppenheimer approximation (7)). For this reason
one can introduce, to a very good approximation, for every electronic state (k)
a total molecular energy $\mathcal{E}_k$ as a function of the nuclear positions
$\mathcal{R} = \{R_1, \ldots, R_F\}$ in space. $\mathcal{R}$ stands here for the totality of all F
independent nuclear parameters. In general F = 3N - 6, where N is the number of
atoms considered. This number is reduced when the motions of the centers are
restricted; for example, F = N - 1 if all atoms are required to lie on a
straight line.

In general there are $\binom{N}{2}$ internuclear distances $R_{\lambda\mu}$ between the
$\lambda$-th and $\mu$-th atoms ($\lambda, \mu = 1, \ldots, N$). Of these, however, only F are
independent; the remaining $\binom{N}{2} - F$ internuclear distances can therefore be
expressed through the other F. $\mathcal{E}_k(\mathcal{R})$ is called the energy hypersurface.
Only for F = 1 or 2 is it a curve or a surface, respectively. For F > 2 we speak
of a hypersurface, because the function can be interpreted as such in an
(F + 1)-dimensional space. The totality of all vectors to the centers will be
denoted by $\mathcal{R}' = \{\mathbf{R}_\lambda\}$ ($\lambda = 1, \ldots, N$).
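The counting in this paragraph is easy to tabulate. The sketch below assumes the rules exactly as stated (F = 3N - 6 in general, F = N - 1 for atoms on a straight line) and treats N = 2 as the collinear case; the function names are illustrative.

```python
from math import comb


def independent_parameters(n_atoms, collinear=False):
    """Number F of independent nuclear parameters for n_atoms centers."""
    if n_atoms < 2:
        raise ValueError("at least two atoms are needed")
    if collinear or n_atoms == 2:
        return n_atoms - 1  # atoms restricted to a straight line
    return 3 * n_atoms - 6  # general case


def dependent_distances(n_atoms, collinear=False):
    """Internuclear distances that can be expressed through the other F."""
    return comb(n_atoms, 2) - independent_parameters(n_atoms, collinear)


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, comb(n, 2), independent_parameters(n), dependent_distances(n))
```

For three and four atoms every internuclear distance is independent; from five atoms on, the number of distances exceeds F.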

The $\mathcal{E}_k$ are obtained as energy eigenvalues of the Schrödinger equation
(wave equation)

$$\mathcal{H}\psi_k = \mathcal{E}'_k\,\psi_k, \qquad (1)$$

where $\mathcal{H}$ is the Hamiltonian of the electron system, the atomic nuclei being
regarded as fixed (nuclear coordinates as parameters). In $\mathcal{E}'$ one can in
principle change over to the $\mathcal{R}$, so that

$$\mathcal{E}'(\mathcal{R}') = \mathcal{E}(\mathcal{R}). \qquad (1a)$$

Within the approximation stated above, the nuclear motions (rotation and
vibrations) follow from the wave equation

$$\{K + \mathcal{E}'_k(\mathcal{R}')\}\,\chi_{kj}(\mathcal{R}') = \mathcal{E}_{kj}\,\chi_{kj}(\mathcal{R}'), \qquad (2)$$

where K is the operator of the kinetic energy of the nuclei. In (2),
$\mathcal{E}_k$ plays the role of the potential in which the N nuclei move. The $\mathcal{E}_{kj}$
are thus the energy eigenvalues of the nuclear framework, the influence of the
electrons on the bonds between the atoms (interactions between the atoms)
showing up in $\mathcal{E}_k$. The wave functions $\chi_{kj}$ of the atomic nuclei can also be
rewritten in terms of $\mathcal{R}$, just like $\mathcal{E}'$ in (1a). The wave function of the
electrons, $\psi_k$, depends not only on the electron coordinates r but also on
$\mathcal{R}$ or $\mathcal{R}'$, which appear in (1) as parameters.

The Born-Oppenheimer approximation makes it possible to use the two
time-independent equations (1) and (2) in place of the time-dependent
Schrödinger equation, provided that stationary states of the system are
assumed. It can be shown that exactly

$$\mathcal{E}'(\mathcal{R}') = E(\mathcal{R}) + W, \qquad (3)$$

where

$$W = \sum_{\lambda < \mu} \frac{Z_\lambda Z_\mu}{R_{\lambda\mu}} \qquad (3a)$$

and $Z_\lambda$ and $Z_\mu$ are the respective nuclear charge numbers. E is the
so-called pure electronic energy (without the nuclear repulsion energy W).
In contrast to $\mathcal{E}$ (or $\mathcal{E}'$), it is finite in the whole $\mathcal{R}$-space
($\mathcal{R}'$-space) [bounded function $E(\mathcal{R})$].


II. The Problem


To calculate the states of the nuclear system, $\mathcal{E}'_k$ would have to be
determined from (1) and then inserted into (2), after which Eq. (2) would have
to be treated. The calculation of $\mathcal{E}'_k$ from the wave equation of the electron
system meets with great difficulties, because the equation is of high dimension
owing to the large number of electrons. This difficulty can be circumvented
approximately by setting up for $\mathcal{E}_k$ an approximation $\tilde{\mathcal{E}}_k$,

$$\mathcal{E}_k \approx \tilde{\mathcal{E}}_k, \qquad (4)$$

which at first still contains free parameters $\alpha_j$ that can be fixed by
requirements imposed on $\tilde{\mathcal{E}}$:

$$\tilde{\mathcal{E}} = \tilde{\mathcal{E}}(R_1, \ldots, R_F;\ \alpha_j). \qquad (4a)$$


At present only very few potential curves of diatomic molecules are known,
and these only approximately. They are obtained either from spectroscopic data
(2) by means of the Rydberg-Klein-Rees method (2) or from ab initio
calculations, the latter so far only in the neighborhood of the energy minimum.
Newer ab initio methods (4), on the other hand, allow a number of $\mathcal{E}$ values
to be calculated with satisfactory accuracy also for systems with more than two
atoms and away from the stable configuration of the nuclei. Since, however, no
analytic ansatz $\tilde{\mathcal{E}}$ for N > 2 has been available so far, it was not possible
to gain insight into energy hypersurfaces of higher centricity, which are needed
not only for spectroscopic questions but especially for the discussion of
reaction processes.

The forms of $\tilde{\mathcal{E}}$ known so far for F = 1 (N = 2) suffice for many demands
of spectroscopy, but they do not exploit all the knowledge that we have about
$\mathcal{E}$ in this case, especially when calculations away from the equilibrium
position are available. For this reason the shape of these energy curves is not
satisfactory in all ranges of R. This in turn means that certain chemical and
physical processes which take place precisely in these ranges of R are described
inadequately, and that a possible construction of energy hypersurfaces (see
Section III) from $\tilde{\mathcal{E}}$ functions of lower centricity is not possible. The
task is therefore to find an ansatz $\tilde{\mathcal{E}}$ according to (4) that permits a good
approximation in all ranges of R and can satisfy all the requirements that can
be imposed on $\tilde{\mathcal{E}}$ in order to fix the free parameters $\alpha_j$. In contrast to
the previous forms (5) of $\tilde{\mathcal{E}}$ for N = 2 (F = 1), which on account of their
analytic forms cannot satisfy all requirements, an $\tilde{\mathcal{E}}$ must be found that
not only satisfies all the requirements to be imposed but can also be extended
to the cases N > 2.

In the case of two centers (N = 2, F = 1) the following requirements, which
are satisfied by $\mathcal{E}$, can be imposed on $\tilde{\mathcal{E}}$; here $R = R_0$ denotes the
bond distance:

$$\left(\frac{d\mathcal{E}}{dR}\right)_{R_0} = 0, \qquad (5a)$$

$$\left(\frac{d^j\mathcal{E}}{dR^j}\right)_{R_0} = k^{(j)}, \qquad (5b)$$

$$\mathcal{E}(R_0) = B + \mathcal{E}(a|b), \qquad (5c)$$

$$E = E(ab) + \sum_{j \ge 1} E_j R^j \qquad (R \ll 1), \qquad (5d)$$

$$\mathcal{E} = \mathcal{E}(a|b) + \sum_{j \ge 1} \frac{\varepsilon_j}{R^j} \qquad (R \gg 1). \qquad (5e)$$

In addition there is the requirement that $\tilde{\mathcal{E}}$ have the form given by (3), (3a).
$k^{(j)}$ denotes the "force constant" ("higher force constants" for j > 2).
Furthermore,

$$\lim_{R \to 0} E(R) = E(ab) \qquad (6a)$$

and

$$\lim_{R \to \infty} \mathcal{E}(R) = \mathcal{E}(a|b) = E(a|b). \qquad (6b)$$

The two energies E(a|b) and E(ab) are those of the separated and of the
united atoms (models of the "separated atoms" and of the "united atom"). B is
the binding energy. The coefficients $E_j$ and $\varepsilon_j$ can be determined
approximately from perturbation calculations (6, 7).

Thus, for N = 2 the quantities $k^{(j)}$, B, $R_0$, E(ab), $E_1, E_2, \ldots$ and
E(a|b), $\varepsilon_1, \varepsilon_2, \ldots$ are used to determine the $\alpha_j$ in (4a).

For N > 2 (F > 1) the corresponding requirements on $\tilde{\mathcal{E}}$, which are already
satisfied by $\mathcal{E}$, are more complex. First of all we obtain

$$\left(\frac{\partial\mathcal{E}}{\partial R_i}\right)_{\mathcal{R}^{(0)}} = 0, \qquad (7a)$$

$$\left(\frac{\partial^j\mathcal{E}}{\partial R_i^j}\right)_{\mathcal{R}^{(0)}} = k_i^{(j)}, \qquad (7b)$$

and

$$\mathcal{E}(\mathcal{R}^{(0)}) = B + \mathcal{E}(a|b|c|\cdots|N), \qquad (7c)$$

where $\mathcal{R}^{(0)}$ stands for the totality of all $R_j$ (j = 1, ..., F) that are
assumed in the stable configuration of the nuclei,

$$\mathcal{R}^{(0)} = \{R_1^{(0)}, \ldots, R_F^{(0)}\}. \qquad (8)$$

The system then takes on its lowest energy, which lies lower by B than that of
the separated atoms, denoted by $\mathcal{E}(a|b|c|\cdots|N)$. The force constants are
defined with respect to the coordinate $R_i$. In both equations (7a) and (7b),
i = 1, 2, ..., F, so that there are already F(1 + G) + 1 requirements on
$\tilde{\mathcal{E}}$ if j in (7b) runs up to G.

For the extension of Eqs. (5d), (5e) and of (6a), (6b), the concept of atomic
associations is helpful (5). According to it, one can write for (6a) and (6b)

$$\lim_{[K]} E(\mathcal{R}) = E(K), \qquad (9)$$

where E(K) is the electronic energy of the association [K]. The number of
atomic associations increases rapidly: for N = 2 there are still two
[according to (6a), (6b)]; for N = 3 there are five:

    E(a|b|c)    E(c|ab)
    E(a|bc)     E(abc).        (10)
    E(b|ac)

With four atoms, 15 associations can already be written down:

    E(a|b|c|d)    E(b|ac|d)    E(d|abc)
    E(a|cd|b)     E(a|bc|d)    E(c|abd)
    E(a|bd|c)     E(ad|bc)     E(b|acd)
    E(b|ad|c)     E(ac|bd)     E(a|bcd)
    E(c|ba|d)     E(ab|cd)     E(abcd).

Here the atoms are denoted by a, b, c, and d, and E(a|bc|d) means the
electronic energy of a system in which the atoms b and c have united into a new
atom bc (partial union) with the nuclear charge $Z_b + Z_c = Z_{bc}$, the three
atoms a, bc, and d being infinitely far from one another. Equation (9)
represents the transition to an association energy.
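The atomic associations of N atoms are the partitions of the set of atoms into united-atom groups, so their number for N = 2, 3, 4 is 2, 5, 15. A short sketch that enumerates them (the function names and the label format are illustrative choices, not taken from the text):

```python
def associations(atoms):
    """Yield every partition of `atoms` into groups of united atoms."""
    atoms = list(atoms)
    if not atoms:
        yield []
        return
    first, rest = atoms[0], atoms[1:]
    for partition in associations(rest):
        # Unite `first` with each existing group in turn ...
        for i, group in enumerate(partition):
            yield partition[:i] + [[first] + group] + partition[i + 1:]
        # ... or leave it as a separated atom.
        yield [[first]] + partition


def label(partition):
    """Format a partition like the energies in the text, e.g. 'E(a|bc)'."""
    groups = sorted("".join(sorted(group)) for group in partition)
    return "E(" + "|".join(groups) + ")"


if __name__ == "__main__":
    for n in (2, 3, 4):
        names = [label(p) for p in associations("abcd"[:n])]
        print(n, len(names), names)
```

The label function sorts the groups alphabetically, so E(c|ab) in the text appears here as E(ab|c).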

These considerations lead to the "building-block principle" indicated above.
If one now passes to the so-called incomplete atomic associations [K'],

$$\lim_{[K']} E(\mathcal{R}) = E(K'), \qquad (12)$$

by carrying out only the transitions $R_{\lambda\mu} \to 0$ that belong to each limit
$\lim_{[K]}$ [cf. (9)], while leaving the new and the old (original) atoms of the
resulting system at finite distances, then new energy hypersurfaces of lower
centricity follow from the initial $\tilde{\mathcal{E}}$. Calculations on smaller systems can
thus be used to describe larger ones in the $\tilde{\mathcal{E}}$ representation.

As regards Eqs. (5d) and (5e), their extension to N > 2 consists in writing
down, where appropriate, on passing to an association in which some $R_{\lambda\mu}$
become very small and others very large, representations corresponding to (5d)
and (5e) for the atoms that are very close or very distant, respectively. Such
relations can also be set up within the framework of the incomplete
associations.

The problem now consists in finding an ansatz of this flexibility.


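III. The Solution


An ansatz $\tilde{\mathcal{E}}$ according to (4a) that can satisfy all the requirements is
the following (9):

$$E = \frac{\displaystyle\sum_{f_1, f_2, \ldots, f_F = 0}^{M_1, M_2, \ldots, M_F} \alpha_{f_1 f_2 \cdots f_F}\, R_1^{f_1} R_2^{f_2} \cdots R_F^{f_F}}{\displaystyle\sum_{f_1, f_2, \ldots, f_F = 0}^{M_1, M_2, \ldots, M_F} \alpha'_{f_1 f_2 \cdots f_F}\, R_1^{f_1} R_2^{f_2} \cdots R_F^{f_F}}, \qquad (13)$$

where $\tilde{\mathcal{E}}$ is given according to (3) and (3a) by

$$\tilde{\mathcal{E}} = E + W. \qquad (13a)$$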

The summation limits $M_1, M_2, \ldots, M_F$ are, like the $\alpha$ and $\alpha'$, still free
and can be prescribed in each case according to how accurate the approximation
is to be, given a sufficient number of requirements, and how much computational
effort is admissible and justified.

A remarkable feature of the ansatz (13), (13a), which will not be elaborated
here, is that all the requirements discussed here lead to linear equations in
$\alpha$ and $\alpha'$. For F = 1 (N = 2), (13) becomes (10)

$$E = \frac{\sum_{j=0}^{M} \alpha_j R^j}{\sum_{j=0}^{M} \alpha'_j R^j}, \qquad (14a)$$

and W is then given by

$$W = \frac{Z_a Z_b}{R}. \qquad (14b)$$
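The linearity can be made plausible with a short sketch for the two-center form (14a). It assumes the normalization $\alpha'_0 = 1$ (an assumption made here for illustration; it agrees with the values quoted in (18)) and uses the small-R expansion (5d) and the limit (6b) for E.

```latex
% Multiply (14a) by its denominator: every requirement on E(R) becomes
% a relation that is linear in the alpha_j and alpha'_j.
E(R)\sum_{j=0}^{M}\alpha'_j R^j \;=\; \sum_{j=0}^{M}\alpha_j R^j .

% Small R: insert E = E(ab) + E_1 R + E_2 R^2 + ... and compare powers of R.
\alpha_0 = E(ab)\,\alpha'_0 , \qquad
\alpha_1 = E(ab)\,\alpha'_1 + E_1\,\alpha'_0 , \qquad
\alpha_2 = E(ab)\,\alpha'_2 + E_1\,\alpha'_1 + E_2\,\alpha'_0 .

% Large R: E -> E(a|b), so the leading coefficients must satisfy
\alpha_M = E(a|b)\,\alpha'_M .
```

Conditions at $R_0$ such as (5a) to (5c) can be handled in the same way, by differentiating the product form instead of the quotient.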


If three centers are present, the ansatz has the following form (F = 3):

$$E = \frac{\sum_{k=0}^{M_1}\sum_{l=0}^{M_2}\sum_{m=0}^{M_3} \alpha_{klm}\, R_1^k R_2^l R_3^m}{\sum_{k=0}^{M_1}\sum_{l=0}^{M_2}\sum_{m=0}^{M_3} \alpha'_{klm}\, R_1^k R_2^l R_3^m}, \qquad (15a)$$

where

$$W = \frac{Z_a Z_b}{R_1} + \frac{Z_a Z_c}{R_2} + \frac{Z_b Z_c}{R_3}. \qquad (15b)$$

Here we have assumed that the internuclear distances between the three atoms
a, b, and c are defined as follows:

[Fig. 1. Definition of the internuclear distances $R_1$, $R_2$, $R_3$ between the atoms a, b, and c.]

Knowledge of this surface makes it possible to discuss the reaction

$$a + bc \to ab + c. \qquad (16)$$

Within the framework of the incomplete associations, the energy curves of
ab-c, ac-b, and bc-a are required for the calculation of (15a), (15b). If one
atom goes to infinity, the energy curves of a-c, a-b, and b-c remain.
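A compact way to evaluate a rational form such as (13) or (15a) for any number of distances is sketched below. The representation of the coefficients as dictionaries keyed by exponent tuples, and the function names, are choices made for this illustration; W is taken as the Coulomb repulsion of the bare nuclei in atomic units, which is the form assumed here for (3a).

```python
from math import prod


def rational_energy(distances, alpha, alpha_prime):
    """Evaluate a ratio of two polynomials in R_1 ... R_F.

    alpha and alpha_prime map exponent tuples (f_1, ..., f_F) to the
    coefficients of R_1**f_1 * ... * R_F**f_F in numerator and denominator.
    """
    def polynomial(coefficients):
        return sum(
            c * prod(r ** f for r, f in zip(distances, exponents))
            for exponents, c in coefficients.items()
        )

    return polynomial(alpha) / polynomial(alpha_prime)


def nuclear_repulsion(charges, pair_distances):
    """Sum of Z_i * Z_j / R_ij over the listed pairs (atomic units)."""
    return sum(charges[i] * charges[j] / r for (i, j), r in pair_distances.items())


if __name__ == "__main__":
    # Hypothetical one-distance example with M = 1.
    alpha = {(0,): -2.0, (1,): -1.0}
    alpha_prime = {(0,): 1.0, (1,): 1.0}
    for r in (0.0, 1.0, 10.0):
        print(r, rational_energy([r], alpha, alpha_prime))
```

The total approximate energy is then the sum of the two functions, as in (13a).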
IV. Examples


A. The H$_2$ molecule (10)

With M = 6 in (14a), 13 requirements were satisfied. For this the quantities

    E(ab)  = -2.9037    E_1 = 0    E_2 = 3.792
    E(a|b) = -1.0000    ε_1 = 0    ε_2 = 0    ε_3 = 0
                        ε_4 = 0    ε_5 = 0    ε_6 = -11.0
    k = 0.277           R_0 = 1.40            B = -0.1744        (17)

were required, given here in atomic units. For $\varepsilon_6$, instead of the
correct value -6.4999, the value according to (17) was used, as representative
of the following $\varepsilon_j$ values, which are negative (or zero). The linear
equations for $\alpha_j$ and $\alpha'_j$ then gave the values

    j     α_j        α'_j
    0    -2.9037    +1.000
    1    -1.804     +0.622
    2    +0.355     +1.183
    3    -0.043     -1.539
    4    -0.901     +1.582
    5    +0.566     -0.681
    6    -0.117     +0.117        (18)

with which an excellent agreement with the true energy curve was achieved, as
a comparison with very accurate ab initio calculations (11) shows (Table 1).


TABLE 1

    R      approximation [(14a), (14b), (18)]    ab initio (11)
    0.5         -0.520                              (not given)
    1.0         -1.124                              -1.124
    1.5         -1.173                              -1.173
    2.0         -1.138                              -1.138
    2.5         -1.092                              -1.094
    3.0         -1.049                              -1.051
    4.0         -1.010                              -1.013
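As a hedged cross-check, the rational form (14a) can be evaluated with the coefficients quoted in (18), adding W = 1/R for two protons. Two readings are assumptions of this sketch: the sign of the third numerator coefficient is taken as +0.355, and the first column of Table 1 is taken to be the approximation. Because the coefficients are given to only three decimals, agreement with the table is expected to be rough at larger R.

```python
# Coefficients from (18); the sign of ALPHA[2] is an assumption (garbled in the source).
ALPHA = [-2.9037, -1.804, 0.355, -0.043, -0.901, 0.566, -0.117]
ALPHA_PRIME = [1.000, 0.622, 1.183, -1.539, 1.582, -0.681, 0.117]


def electronic_energy(r):
    """E(R) of Eq. (14a) in atomic units."""
    numerator = sum(a * r ** j for j, a in enumerate(ALPHA))
    denominator = sum(a * r ** j for j, a in enumerate(ALPHA_PRIME))
    return numerator / denominator


def total_energy(r, z_a=1, z_b=1):
    """E(R) plus the nuclear repulsion Z_a Z_b / R."""
    return electronic_energy(r) + z_a * z_b / r


if __name__ == "__main__":
    for r in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0):
        print(f"{r:4.1f}  {total_energy(r):8.4f}")
```

The limits behave as required by (6a) and (6b): the electronic energy tends to the leading coefficient ratio E(ab) = -2.9037 for R to 0 and to E(a|b) = -1 for large R.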


B. The system He, H, H

Here only a very rough approximation for $\tilde{\mathcal{E}}$ will be given for the time
being; the aim is rather to show the course of the calculation. For this reason
we set $M_1 = M_2 = M_3 = 1$ in (15a), and (15b) here takes the form

$$W = \frac{2}{R_1} + \frac{2}{R_2} + \frac{1}{R_3} \qquad (19)$$

when the distances are defined as in Fig. 2.

[Fig. 2. Definition of the internuclear distances in the system He, H, H.]

For reasons of symmetry the energy must in this case be invariant under the
interchange of the two H atoms; we therefore have

$$\alpha_{klm} = \alpha_{lkm}, \qquad \alpha'_{klm} = \alpha'_{lkm}, \qquad (20)$$

and E takes the form

$$E = \frac{\alpha_{000} + \alpha_{100}(R_1 + R_2) + \alpha_{001}R_3 + \alpha_{110}R_1R_2 + \alpha_{101}R_3(R_1 + R_2) + \alpha_{111}R_1R_2R_3}{1 + \alpha'_{100}(R_1 + R_2) + \alpha'_{001}R_3 + \alpha'_{110}R_1R_2 + \alpha'_{101}R_3(R_1 + R_2) + \alpha'_{111}R_1R_2R_3} \qquad (21)$$

because

$$\alpha_{100} = \alpha_{010}, \quad \alpha_{101} = \alpha_{011}, \quad \alpha'_{100} = \alpha'_{010}, \quad \alpha'_{101} = \alpha'_{011}. \qquad (21a)$$


If the He atom goes to infinity ($R_1, R_2 \to \infty$), (21) together with (19)
goes over into

$$\tilde{\mathcal{E}} = \frac{\alpha_{110} + \alpha_{111}R_3}{\alpha'_{110} + \alpha'_{111}R_3} + \frac{1}{R_3} \qquad \text{H}_2\text{-He} \qquad (22)$$

On the other hand,

$$\tilde{\mathcal{E}} = \frac{\alpha_{101} + \alpha_{111}R_2}{\alpha'_{101} + \alpha'_{111}R_2} + \frac{2}{R_2} \qquad \text{H-He-H} \qquad (23)$$

results when one H atom is removed ($R_1, R_3 \to \infty$). Removal of the other
leads, because of (21a), to the same result.

The union of He and H gives in (21)

$$E = \frac{\alpha_{000} + (\alpha_{100} + \alpha_{001})R + \alpha_{101}R^2}{1 + (\alpha'_{100} + \alpha'_{001})R + \alpha'_{101}R^2}, \qquad (24)$$

because $R_1 \to 0$ and $R_2 = R_3 = R$. When the corresponding W is added to (24),

$$\tilde{\mathcal{E}} = \frac{\alpha_{000} + (\alpha_{100} + \alpha_{001})R + \alpha_{101}R^2}{1 + (\alpha'_{100} + \alpha'_{001})R + \alpha'_{101}R^2} + \frac{3}{R} \qquad (24a)$$

is the approximation for the energy curve of the LiH molecule.

Correspondingly, one obtains the approximation for He ··· He when $R_3 \to 0$
and $R_1 = R_2 = R$ is set:

$$\tilde{\mathcal{E}} = \frac{\alpha_{000} + 2\alpha_{100}R + \alpha_{110}R^2}{1 + 2\alpha'_{100}R + \alpha'_{110}R^2} + \frac{4}{R}. \qquad (25)$$

From Eqs. (22) to (25) the $\alpha_{klm}$ and $\alpha'_{klm}$ can then be determined.
One finally obtains

$$E = \frac{-14.670 - 7.5(R_1 + R_2) - 10.754R_3 - 5.807R_1R_2 - 5.904R_3(R_1 + R_2) - 3.123R_1R_2R_3}{1 + R_1 + R_2 + 1.507R_3 + R_1R_2 + R_3(R_1 + R_2) + 0.8R_1R_2R_3}, \qquad (26)$$

which, with (19), is certainly a very rough approximation. Improvements are
then achieved by increasing the $M_j$ values and by taking further information
into account, especially better approximations for the two-center systems.
Unfortunately the He-H curve is not yet known with sufficient accuracy at
present.
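The limiting cases used to fix the coefficients can be read off numerically from the final He, H, H expression. The sketch below codes that expression together with (19); the function names are illustrative, and the coefficient values are those printed in the text.

```python
def electronic_energy_hehh(r1, r2, r3):
    """Rational form for He, H, H (r1, r2: He-H distances, r3: H-H distance)."""
    numerator = (
        -14.670
        - 7.5 * (r1 + r2)
        - 10.754 * r3
        - 5.807 * r1 * r2
        - 5.904 * r3 * (r1 + r2)
        - 3.123 * r1 * r2 * r3
    )
    denominator = (
        1
        + (r1 + r2)
        + 1.507 * r3
        + r1 * r2
        + r3 * (r1 + r2)
        + 0.8 * r1 * r2 * r3
    )
    return numerator / denominator


def nuclear_repulsion_hehh(r1, r2, r3):
    """W of Eq. (19) in atomic units."""
    return 2 / r1 + 2 / r2 + 1 / r3


if __name__ == "__main__":
    far = 1e6
    print("united atom (all R = 0):    ", electronic_energy_hehh(0.0, 0.0, 0.0))
    print("He removed, H-H united:     ", electronic_energy_hehh(far, far, 0.0))
    print("all atoms separated:        ", electronic_energy_hehh(far, far, far))
```

With all atoms far apart the electronic energy tends to -3.123/0.8, about -3.904, which matches the sum of the separated-atom energies E(ab) = -2.9037 and E(a|b) = -1.0000 quoted in (17) for the H2 example.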

If the three atoms are placed on a straight line with the He atom on the
outside, the following energy surface results qualitatively ($R_2 = R_1 + R_3$):

[Fig. 3. Qualitative energy surface for the linear arrangement with the He atom on the outside.]

For the linear arrangement with the He atom in the middle one obtains (again
qualitatively, with $R_3 = R_1 + R_2$):

[Fig. 4. Qualitative energy surface for the linear arrangement with the He atom in the middle.]

In view of the rough approximation, the weak minimum should not be taken too
seriously.

C. The H$_3$ molecule

Here

$$W = \frac{1}{R_1} + \frac{1}{R_2} + \frac{1}{R_3} \qquad (27)$$

and the ansatz for E was again made with $M_1 = M_2 = M_3 = 1$. The conditions
on $\alpha$ and $\alpha'$ here read

$$\alpha_{klm} = \alpha_{lkm} = \alpha_{kml} = \alpha_{mlk} \qquad (28)$$

(and likewise for $\alpha'$), so that the ansatz for H$_3$ takes the form

$$E = \frac{\alpha_{000} + \alpha_{100}(R_1 + R_2 + R_3) + \alpha_{110}(R_1R_2 + R_1R_3 + R_2R_3) + \alpha_{111}R_1R_2R_3}{1 + \alpha'_{100}(R_1 + R_2 + R_3) + \alpha'_{110}(R_1R_2 + R_1R_3 + R_2R_3) + \alpha'_{111}R_1R_2R_3} \qquad (29)$$

because

$$\alpha_{100} = \alpha_{010} = \alpha_{001}, \qquad \alpha_{110} = \alpha_{101} = \alpha_{011},$$
$$\alpha'_{100} = \alpha'_{010} = \alpha'_{001}, \qquad \alpha'_{110} = \alpha'_{101} = \alpha'_{011}. \qquad (29a)$$

The energy surface of H$_3$ has been known at many points for some time, in
particular the minimum of the total energy for the linear and for the triangular
configuration of the three protons. The small number of free parameters does
not allow all the known information to be used. Besides a procedure similar to
that in (22) to (25), the binding energy $B^{\triangle}$ of the triangular arrangement
was taken into account, as well as the equilibrium distance $R_0^{\triangle}$ of this
equilateral triangle. In the linear case only the binding energy B at
equilibrium was required.

The final result for H$_3$ reads:

$$E = \frac{-22.648 - 3.619(R_1 + R_2 + R_3) - 5.528(R_1R_2 + R_1R_3 + R_2R_3) - 2.925R_1R_2R_3}{1 + 1.817(R_1 + R_2 + R_3) + R_1R_2 + R_1R_3 + R_2R_3 + 2.925R_1R_2R_3} \qquad (30)$$

The comparison of the energy values calculated so far by variational methods
(12), where in some cases the corresponding values were obtained from the
calculations by interpolation, with the values that follow from (30) and (27)
is given in the following table for the case of an equilateral triangular
configuration (side length R) (Table 2).


TABLE 2

    R      approximation [(30), (27)]    variational (12)
    1.0         -1.284                       -1.17
    1.5         -1.338                       -1.33
    2.0         -1.335                       -1.33
    3.0         -1.298                       -1.23
    4.0         -1.261                       -1.13


If the three atoms lie on a straight line, with the middle atom halfway
between the other two, the following comparison results (Table 3):


TABLE 3

    R      approximation [(30), (27)]    variational (12)
    1.0         -1.194                       -1.17
    1.5         -1.278                       -1.28
    2.0         -1.288                       -1.26
    3.0         -1.261                       -1.18
    4.0         -1.227                       -1.10
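The first numerical column of Tables 2 and 3 can be reproduced from the final H3 expression (30). The sketch assumes that the nuclear repulsion (27) is the sum of the three inverse proton-proton distances in atomic units (the printed equation is damaged in the source) and that the first column of each table is the approximation; the function names are illustrative.

```python
def electronic_energy_h3(r1, r2, r3):
    """Symmetric rational form (30) for H3, atomic units."""
    s1 = r1 + r2 + r3
    s2 = r1 * r2 + r1 * r3 + r2 * r3
    s3 = r1 * r2 * r3
    numerator = -22.648 - 3.619 * s1 - 5.528 * s2 - 2.925 * s3
    denominator = 1 + 1.817 * s1 + s2 + 2.925 * s3
    return numerator / denominator


def total_energy_h3(r1, r2, r3):
    """Electronic energy plus the assumed repulsion 1/R1 + 1/R2 + 1/R3."""
    return electronic_energy_h3(r1, r2, r3) + 1 / r1 + 1 / r2 + 1 / r3


def equilateral(r):
    """Equilateral triangle with side length r (Table 2)."""
    return total_energy_h3(r, r, r)


def symmetric_linear(r):
    """Linear chain with neighbor distance r, outer distance 2r (Table 3)."""
    return total_energy_h3(r, r, 2 * r)


if __name__ == "__main__":
    for r in (1.0, 1.5, 2.0, 3.0, 4.0):
        print(f"{r:3.1f}  {equilateral(r):8.4f}  {symmetric_linear(r):8.4f}")
```

That both tables come out to within about 0.001 supports the assumed form of (27), although it is of course no check on the variational reference values.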


The distance between neighboring H atoms is denoted by R. In view of its
simplicity, and of the fact that not all information was used, the
approximation is satisfactory. The deviations amount on average to only a few
percent. An improvement can again be achieved by increasing $M_j$ and by using
further information, as well as by better approximations for the systems of
lower centricity.

I cordially thank Mrs. I. Funke for the calculations carried out here and for
her help in preparing the programs.

Thanks are due to the Verband der Chemischen Industrie for making research
funds available.
I. Introduction

It is well known [see, for example, Daudel (1966)] that, according to
transition-state theory [see, for example, Glasstone et al. (1941)], the rate
constant k of a chemical reaction,

A + B + ··· → C + D + ···,

taking place in a certain solvent, is given by the equation

$$k = \eta(1 + t)\,\frac{\chi T}{h}\,\frac{f_{M^\ddagger}}{f_A f_B \cdots}\,\exp\left\{-\frac{\Delta E^\ddagger_v + \Delta E^\ddagger_l + \Delta E^\ddagger_d + \Delta E^\ddagger_{nb} + \Delta E^\ddagger_s(T)}{\chi T}\right\}. \qquad (1)$$

In this equation $\chi$ is the Boltzmann constant; $\eta$, the transmission coefficient;
t, the tunnel-effect factor; $f_A$, $f_B$, $f_{M^\ddagger}$ are the various partition functions; and
$\Delta E^\ddagger_v$, $\Delta E^\ddagger_l$, $\Delta E^\ddagger_d$, $\Delta E^\ddagger_{nb}$, $\Delta E^\ddagger_s$ denote respectively the contributions to the
potential barrier of the vibrational energy, the localized bonds, the delocalized
bonds, the interaction between nonbonded atoms, and the solvation energy.

The effective application of this theory to conjugated organic molecules
began in 1942, when Wheland proposed a convenient way to calculate $\Delta E^\ddagger_d$
(Wheland, 1942), which is very often the most important term of Eq. (1).
This term is often called the localization energy in the case of substitution
reactions and the para- or ortho-localization energy in the case of addition reactions.

Localization energies are examples of dynamic indexes, and their use based
on Eq. (1) represents the dynamic method of studying chemical reactivity.
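As a rough numerical illustration of the structure of Eq. (1), the sketch below multiplies the frequency factor, the ratio of partition functions, and the Boltzmann factor of the summed barrier contributions. The function name, the argument layout, and the use of SI units per molecule are assumptions made for this example; the barrier terms are passed in as plain numbers and nothing here computes them.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J s


def rate_constant(temperature, barrier_terms, f_transition_state, f_reactants,
                  transmission=1.0, tunnel=0.0):
    """Rate constant in the form of Eq. (1).

    barrier_terms: contributions to the potential barrier, in joules per
    molecule (vibrational, localized bonds, delocalized bonds, nonbonded
    interactions, solvation).
    f_reactants: partition functions of the reactants.
    """
    barrier = sum(barrier_terms)
    partition_ratio = f_transition_state / math.prod(f_reactants)
    return (
        transmission * (1 + tunnel)
        * K_B * temperature / H
        * partition_ratio
        * math.exp(-barrier / (K_B * temperature))
    )


if __name__ == "__main__":
    no_barrier = rate_constant(300.0, [0.0] * 5, 1.0, [1.0, 1.0])
    print(f"frequency factor at 300 K: {no_barrier:.4e} per second")
```

Because the barrier enters exponentially, raising any single contribution by $\chi T \ln 2$ halves the rate, which is why the term that differs most between positions or molecules tends to dominate comparisons of reactivity.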

There is another approach to the same problem, which is based on the
consideration of various properties associated with the reagents, such as (a) the
bond order, introduced by Pauling in the valence-bond method (Pauling,
1932) and by Coulson (1939) in the molecular-orbital theory; and (b) the
free valence number, introduced by Daudel and Pullman (1945a)* in the
valence-bond theory following an idea of Swartholm (1941) and by Coulson
in the molecular-orbital theory (Coulson, 1946).

Bond orders, free valence numbers, atomic charges, etc., are often called
static indexes, as they do not depend on the transition state. Their use for the
interpretation of chemical reactions, which is mainly based on chemical
intuition (see, for example, Daudel and Pullman, 1945b) or perturbation theory
(Coulson and Longuet-Higgins, 1947), corresponds to the static method. This
is also called the molecular diagram method, as Daudel and Pullman (1946)
proposed the name "molecular diagram" to designate a graph representing the
distribution of the static indexes in a given molecule.

In the author’s opinion the theoretical background of the static method is 
less satisfactory than the basis of the dynamic method, and the first convenient 
explanation of the success of the static method was given when it was possible 
to establish relationships between static and dynamic indexes, at least for the 
ground states of organic molecules. 

Daudel et al. (1950) observed such a relation between the free valence 
numbers and the localization energies in the case of alternant hydrocarbons 
by using the valence-bond method. This relation has been extended by Roux 
(1950) in the framework of the molecular-orbital method. This last work has 
been confirmed by Burkitt et al. (1951). 

But when both the static and the dynamic methods are used to study the 
same reaction between nonexcited molecules the second one appears to be 
usually better from the practical point of view (Sung et al., 1960) as it is 
from the theoretical one. 


II. Examples of Application of the Molecular Diagram 
Method to Photochemistry 

A. Photocyclization of butadiene 

It seems that the first proposal to use the static indexes in studying a
photochemical problem is due to Pullman and Daudel (1946). They calculated
the bond orders and the free valence numbers for the ground state and
the first excited state of butadiene using the valence-bond method in which,
as is well known, the wave function is expanded in Slater determinants. The
molecular diagrams of Fig. 1 symbolize their results.

* See also Daudel et al. (1946).

[Fig. 1. Molecular diagrams of butadiene: ground state and excited state.]

In the ground state the two bonds 1-2 and 3-4 have the highest bond
orders. The free valence numbers at atoms 1 and 4 are already appreciable, but
they become very large in the first excited state, in which the central bond 2-3
becomes similar to a double bond, whereas the bond orders of 1-2 and 3-4
become very small.

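We can describe these results in other words. In the valence-bond method
the wave function describing a state of butadiene is written as

$$\Psi = a\Psi_{\mathrm{I}} + b\Psi_{\mathrm{II}},$$

if $\Psi_{\mathrm{I}}$ represents the Kekulé formula (I) and $\Psi_{\mathrm{II}}$ the Dewar formula (II).

[Structural formulas (I) and (II); reaction scheme: butadiene + hν → (III).]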


In the ground state a is large and b small. The ground state is represented 
conveniently by the Kekule formula. In the excited state a is small and b 
large. The excited state is conveniently represented by the Dewar formula. 
It could be anticipated that under the effect of light the butadiene could be 
transformed into cyclobutene. In 1946, there was no experimental evidence for 
such a phenomenon. But recently Srinivasan (1963) has found that when 
butadiene is irradiated in dilute ether solution, cyclobutene (III) is formed. 


B. Photohydrolysis of nitrophenyl ethers

The distribution of the electronic charges for various electronic states of 
nitrobenzene was calculated by Fernandez-Alonso (1951) during his stay in 
our laboratory. 



[Fig. 2. Distribution of the electronic charges in nitrobenzene: ground state and excited state.]

Figure 2 makes it possible to compare these distributions for the ground
state and for one of the first excited states which, as stated by the author,
could be reached under usual photochemical conditions. A striking difference
appears between the two states. In the ground state the nitro group withdraws
electrons from the ortho and para positions and does not alter the
electronic charge of the meta position.

In the excited state, on the contrary, the nitro group withdraws electrons
mainly from the meta position, the para position not being significantly
perturbed.

Five years later, Havinga et al. (1956) studied the photochemical hydrolysis
of the isomeric nitrophenyl dihydrogen phosphates and also of the bisulfate
esters. They observed that the process is most efficient for the meta isomers, as
if, in the excited state of these molecules, the nitro group were able to withdraw
electrons from the phenolic-phosphate oxygen atom, thus facilitating
heterolytic fission.

More recently Zimmerman (1963) studied the photochemical behavior of
the trityl ethers of m-nitrophenol and p-nitrophenol in aqueous dioxane.
In the dark at 25° the meta compound is stable, whereas the para compound
is slowly hydrolyzed. Under the effect of light, on the contrary, the quantum
efficiency is much greater for the meta compound (0.062) than for the para
derivative (0.006). Again an interpretation is obtained if we admit, as
suggested by the Fernandez-Alonso calculation, that in some excited states of
these molecules the nitro group withdraws electrons from the meta position.

Zimmerman and Somasekhara (1963) have calculated the distribution of 
the electronic charges for the ground states and the first excited states of the 
considered trityl ethers. 

Table 1 contains the electronic charges obtained for the oxygen of the
OCφ$_3$ group.


TABLE 1

                       Ground state    Excited state
    Meta derivative       1.764           1.279
    Para derivative       1.703           1.307


It is clear that, as for the reactivity, the order of the electronic charges in
the excited state is the reverse of the corresponding order in the ground state.

C. Phototransformation of a base into an acid

Coulson and Jacobs (1949) had studied, theoretically, charge migration in
aniline under the effect of irradiation. They observed that the electronic charge
of nitrogen is smaller in the first singlet excited state than in the ground
state; that is to say, the excited molecule should be less basic. Forster (1949a)
has indeed observed that if a base such as 3-aminopyrene (which must behave
similarly) is irradiated by normal light, the excited molecules have acidic
properties. More precisely, Forster studied absorption and fluorescence
spectra as a function of the pH of the solution containing the amino
compound. Obviously, the absorption spectrum gives information about the
ground state of the molecule. The fluorescence spectrum is related to the
electronically excited states. Up to pH 2 the absorption spectra are essentially
those of the ArNH$_3^+$ ions, whereas the fluorescence spectra correspond to
ArNH$_2$. This shows that the molecules in their excited states have less
tendency to add a proton than the ground-state molecules. Furthermore, near
pH 12 some new bands appear in the fluorescence spectra which can be
assigned to the ArNH$^-$ ions. These ions are probably the result of a reaction
such as the following:


ArNH$_2$ ⇌ ArNH$^-$ + H$^+$,

where the amino compound acts as an acid. No such bands appear in the
absorption spectra.

To interpret this result, Sandorfy (1951) calculated the distribution of
the electronic charges in an aromatic amine by the molecular-orbital method,
taking account of both the π and the σ orbitals. He found that the nitrogen,
which is negative in the ground state, becomes positive in the first electronic
excited state, which would explain why the molecule becomes an acid.

Other cases have been treated by the same procedure. For
example, Jaffé et al. (1964) observed that the pK of some excited states
of azobenzene follows the charge on the nitrogen atom and that in the case
of azoxybenzene there is a satisfactory relation between the charge on the
oxygen atom and the pK.

D. Other examples 

Crawford and Coulson (1948) studied the photodimerization of acenaphthylene
by a similar method. Later it was observed (Sandorfy, 1950;
Buu-Hoï et al., 1951) that all the free valence numbers of molecules like
anthracene and naphthacene in their first electronic excited states are greater than
those of the corresponding ground states, the greatest free valences remaining
those of the meso carbon atoms. This increase of the free valence numbers
could play a role in the photodimerization and the photooxidation
of the molecule considered. On the other hand, the bond order of the
central bond of stilbene in its first excited state is smaller than that in the
ground state (Buu-Hoï et al., 1951). This phenomenon could explain
the trans → cis photoisomerization of stilbene. Following the same
procedure, Masse (1954) and Bloch-Chaude and Masse (1955) tried to explain
the photochromic properties of some derivatives of pyranospirane. We must
also point out the work of Mantione and Pullman (1964) on the photodimerization
of thymine.

Finally, a new kind of static index has recently been introduced. Woodward
and Hoffmann (1965) studied some reactions resulting in the formation
of a single bond between the termini of a linear system containing π-electrons.
The authors suggested that the steric course of the reaction is determined by
the symmetry of the highest occupied molecular orbital of the open-chain
partner and should therefore differ for different electronic states. Longuet-Higgins
and Abrahamson (1965) have discussed along these lines the conversion
of cyclobutene to butadiene.

III. Discussion 

Now a question arises: Is the explanation of the brilliant success of the
molecular diagram method in interpreting or even predicting photochemical
reactivity the same as for molecules in their ground state?

To answer this question we must consider separately the case of the pK,
because Förster (1949b) has shown that the new acid-base equilibrium in the
excited state is often established during the lifetime of that state.

Therefore we are often in the presence of a real equilibrium, and the success
of the molecular diagram method probably lies in the existence of a relation
between the charges and the term $\Delta\mathcal{E}_d$ representing the difference between
the energy of the delocalized bond of the ion and that of the initial molecule.
This point needs further investigation. It is, however, interesting to recall that
the rate of proton transfer from a solvent such as water to an acid is nearly
constant (Eigen et al., 1964). The pK gives, in a sense, a measure of the rate
of proton transfer from the acid to the solvent.

In conclusion, the case of the pK seems to be rather clear, and Table 2
shows the variation of some pK values with the electronic state of the molecule
(Jackson and Porter, 1961).


TABLE 2

                     pK               pK (first singlet    pK (first triplet
                     (ground state)   excited state)       state)

β-Naphthol           9.46             2.8                  8.1
β-Naphthylamine      4.1              -2                   3.3
Acridine             5.5              10.6                 5.6


Obviously the pK of the triplet state is very similar to the pK of the ground
state, but the pK of the first singlet excited state differs greatly from the
latter (it is much smaller for β-naphthol and β-naphthylamine and much larger
for acridine). Murrell (1964) has given an explanation of this phenomenon, observing
that the orbitals which are responsible for the charge transfer lie much farther
above the triplet-state energy than above the first excited singlet. Linnett
(1964) has offered another explanation based on the consideration of electron
correlation.
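The comparison drawn from Table 2 can be made explicit with a small calculation. The following is only an illustrative sketch: the pK values are transcribed from Table 2, while the dictionary layout and the function name `pk_shifts` are assumptions introduced here.

```python
# pK values transcribed from Table 2 (Jackson and Porter, 1961).
# The data layout and names below are illustrative assumptions.
PK = {
    "beta-naphthol": {"ground": 9.46, "singlet": 2.8, "triplet": 8.1},
    "beta-naphthylamine": {"ground": 4.1, "singlet": -2.0, "triplet": 3.3},
    "acridine": {"ground": 5.5, "singlet": 10.6, "triplet": 5.6},
}


def pk_shifts(table):
    """Return the change in pK on excitation, relative to the ground state."""
    shifts = {}
    for name, pk in table.items():
        shifts[name] = {
            "singlet": pk["singlet"] - pk["ground"],
            "triplet": pk["triplet"] - pk["ground"],
        }
    return shifts


SHIFTS = pk_shifts(PK)
for name, shift in SHIFTS.items():
    # A shift of one pK unit corresponds to a factor of 10 in the
    # acid dissociation constant.
    print(f"{name}: singlet {shift['singlet']:+.2f}, triplet {shift['triplet']:+.2f}")
```

For all three compounds the triplet shift stays within about one and a half pK units, whereas the singlet shift exceeds five units in magnitude.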

The case of the other reactions is completely different as we do not know 
if it is possible to use the transition-state theory, which is open to criticism 
from the theoretical point of view (Laidler, 1955). Furthermore, usually no 
temperature coefficients are found, and when both the static method and the 
dynamic one are compared with experimental results (Havinga, 1966) the 
static method appears to be the best regardless of what happens in the case 
of molecules in their ground state. 

In the case of excited molecules the success of the molecular diagram 
method lies perhaps in the fact that the static indexes remain in relation to 
the shape of the potential surface. This certainly plays an important role in 
the determination of the reaction rates even if the transition-state theory 
does not apply. This statement certainly needs further investigation.

Electron Transfer and Atomic Magnetic Moments
in the Ordered Intermetallic Compound AlFe$_3$

During the past four decades a vigorous effort has been made by many
investigators, of whom one of the foremost has been Professor John C. Slater,
to develop a satisfactory and comprehensive quantum mechanical theory of
the electronic structure of metals and alloys. There is general agreement that
the Schrödinger equation provides a correct basis for such a theory, but the
mathematical difficulties in its application to crystals are so great that even
now, forty years after the discovery of quantum mechanics, the theory remains
incomplete.

Throughout this period I have striven to formulate a semiempirical theory
of metals and alloys, based upon a set of postulates suggested by quantum
mechanical arguments and supported by observed properties, especially
magnetic properties (1) and interatomic distances (2). This theory, the
resonating-valence-bond theory (3,4), permits the discussion in a moderately
satisfactory way of some properties of metals and alloys that have not yet been
incorporated into the rigorous quantum mechanical theory. As an example
there is given in the following paragraphs a discussion of the observed
interatomic distances and atomic magnetic moments in the ordered intermetallic
compound AlFe$_3$.

The compound AlFe$_3$ has the LI2 structure, which is closely related to the
body-centered cubic structure of alpha iron (5). The atomic coordinates are
8 Fe$^A$ at 000, 0½½, ½0½, ½½0, ½½½, ½00, 0½0, 00½; 4 Al at ¼¼¼, ¼¾¾, ¾¼¾, ¾¾¼; and
4 Fe$^B$ at ¾¾¾, ¾¼¼, ¼¾¼, ¼¼¾. The lattice constant has the value $a_0$ = 5.794 Å.
The structure can be described as having Fe$^A$ atoms at the corners of small
cubes (edge one-half that of the unit cube), with Al atoms and Fe$^B$ atoms
alternating in their centers. Each Al atom is in contact with 8 Fe$^A$, at the
distance 2.509 Å, each Fe$^A$ with 4 Al and 4 Fe$^B$, and each Fe$^B$ with 8 Fe$^A$.
There are also six neighbors about each atom at the distance 2.897 Å ($a_0$/2).

The observed bond length for alpha iron is 2.482 Å, and that for aluminum,
changed from ligancy 12 to the body-centered structure,* is 2.792 Å. The
weighted average of these, 2.560 Å, is expected for AlFe$_3$ from Vegard’s rule
of additivity, which has been found to agree with observation for many alloys.
The considerable deviation of this value from the observed value, 2.509 Å,
requires explanation.
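The contact distances and coordination numbers quoted above follow directly from the lattice constant and the site positions. The sketch below checks them by brute force. It assumes Fe(A) on the eight half-integer positions and Al and Fe(B) on the two face-centered sets of quarter positions; which quarter set is called Al is an assumed convention (exchanging the two sets gives an equivalent description), and all names are illustrative.

```python
import itertools
import math

A0 = 5.794  # lattice constant in angstroms, as quoted in the text

# Face-centering translations of the cubic cell.
FCC = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]


def fcc_set(origin):
    """Four positions generated from `origin` by the face-centering translations."""
    return [tuple((o + t) % 1.0 for o, t in zip(origin, shift)) for shift in FCC]


# Assumed site assignment (labels are illustrative).
SITES = {
    "FeA": fcc_set((0.0, 0.0, 0.0)) + fcc_set((0.5, 0.5, 0.5)),
    "Al": fcc_set((0.25, 0.25, 0.25)),
    "FeB": fcc_set((0.75, 0.75, 0.75)),
}


def neighbour_shells(center, cutoff=3.0):
    """Map each neighbour distance (angstroms, 3 decimals) to species counts."""
    shells = {}
    for species, positions in SITES.items():
        for pos in positions:
            # Include the neighbouring periodic images of every site.
            for image in itertools.product((-1, 0, 1), repeat=3):
                squared = sum((p + i - c) ** 2 for p, i, c in zip(pos, image, center))
                dist = A0 * math.sqrt(squared)
                if 1e-6 < dist <= cutoff:
                    shell = shells.setdefault(round(dist, 3), {})
                    shell[species] = shell.get(species, 0) + 1
    return shells


AROUND_AL = neighbour_shells((0.25, 0.25, 0.25))
AROUND_FEA = neighbour_shells((0.0, 0.0, 0.0))
AROUND_FEB = neighbour_shells((0.75, 0.75, 0.75))
```

Under these assumptions each dictionary contains two shells, one at $a_0\sqrt{3}/4$ and one at $a_0/2$, with the coordination described in the text.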

Moreover, neutron-diffraction studies (6) have led to the assignment of the
magnetic moment value 1.50 Bohr magnetons to the Fe$^A$ atoms and 2.18 to
the Fe$^B$ atoms (the observed value for alpha iron being 2.22); these values
also need a theoretical explanation.

The explanation is given by the theory of electron transfer in alloys (7;
4, p. 431), which is a part of the resonating-valence-bond theory. Metal
atoms are divided into three classes: hypoelectronic atoms, such as Na, Mg, Al,
K, Ca, etc. (elements at the left of the periodic table); buffer atoms, such as
the atoms of the iron group and other transition groups; and hyperelectronic
atoms, such as Cu, Zn, etc. The valence of a hypoelectronic atom is limited
by the number of electrons in its valence shell; it has an excess of orbitals.
A hypoelectronic atom can increase its valence by accepting electrons from
other atoms; thus the aluminum atom, which usually has valence 3, can achieve
valence 4 by accepting one electron. Hyperelectronic atoms, which have an
excess of electrons over orbitals in the valence shell, can increase their metallic
valence by donating electrons. Buffer atoms can donate or accept electrons
without change in valence. We expect accordingly that alloys containing
hypoelectronic atoms and either hyperelectronic atoms or buffer atoms would
tend to undergo electron transfer from the hyperelectronic or buffer atoms to
the hypoelectronic atoms, thus increasing the number of valence bonds per
atom and causing increased stability of the alloy. In accordance with this
argument, it might be expected that the aluminum atom would accept one
electron from the iron atoms in AlFe$_3$, increasing its valence to 4. Aluminum
and iron have nearly the same electronegativity [1.5 and 1.8, respectively
(4, p. 93)], so that little transfer of electric charge as a consequence of partial
ionic character of bonds is expected; hence the charge of an aluminum atom
to which an electron has been transferred is close to −1, a limit of the range
(−1 to +1) allowed by the electroneutrality principle (8; 4, p. 172).

Stabilization by Coulomb interaction requires that the positive charges
balancing the negative charge of the aluminum atom be at the minimum
distance; hence we locate these charges on the Fe$^A$ atoms (at distance 2.509
Å), and conclude that half an electron has been removed from each Fe$^A$
atom, by transfer to Al.

* The change is made with use of the equation $D(n) = D(1) - 0.600 \log n$, where D(n)
is the bond distance for bond number n (Ref. 2; Ref. 4, p. 400), with consideration of the
eight nearest and six next-nearest neighbors in the body-centered structure.

Electron transfer to or from an atom is expected to change its metallic
radius. It is difficult to make a reliable estimate of the change in metallic
radius resulting from electron transfer; however, it is found that as a rough
approximation the value of the metallic radius may be taken as that of the
atom with atomic number equal to the electron number of the atom involved
in electron transfer. Thus the metallic radius of aluminum with an added
electron may be taken to be that of the next element in the periodic table,
silicon, which for the body-centered structure* corresponds to the bond
length 2.578 Å. The effective diameter of an atom Fe$^A$ that has lost half an
electron is found by interpolating between the values for manganese (valence
6, body-centered structure) and iron; it is 2.486 Å. Accordingly, the average
bond length expected for AlFe$_3$, with electron transfer as described above,
is 2.508 Å, in excellent agreement with the observed value.
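The conversion of a single-bond distance to the body-centered structure by means of $D(n) = D(1) - 0.600 \log n$ can be sketched numerically. The following is one possible reading of the procedure, not necessarily the exact one used for the quoted values: it assumes that the valence is shared among eight short bonds and six long bonds (longer by the factor $2/\sqrt{3}$), that both bond types obey the equation with a base-10 logarithm, and that silicon has valence 4. The function name and the bisection solver are illustrative choices.

```python
import math


def bcc_bond_length(d_single, valence, coeff=0.600):
    """Nearest-neighbour distance (angstroms) in a body-centered structure.

    Assumes the valence is divided among 8 short and 6 long bonds, each
    obeying D(n) = D(1) - coeff * log10(n).
    """
    long_over_short = 2.0 / math.sqrt(3.0)  # geometry of the bcc lattice

    def residual(d_short):
        d_long = long_over_short * d_short
        # Ratio of bond numbers implied by the difference in bond lengths.
        n_ratio = 10.0 ** (-(d_long - d_short) / coeff)
        # Bond numbers must add up to the valence: 8*n_short + 6*n_long.
        n_short = valence / (8.0 + 6.0 * n_ratio)
        return d_single - coeff * math.log10(n_short) - d_short

    # Bisection: the residual is positive at d_single and decreases with d_short.
    lo, hi = d_single, d_single + 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Silicon: D(1) = 2.353 angstroms as given in the footnote; valence 4 assumed.
SI_BCC = bcc_bond_length(2.353, 4)
print(f"Si, body-centered structure: {SI_BCC:.3f} angstroms")
```

With these assumptions the result lies within about 0.01 Å of the 2.578 Å quoted in the text; the small difference presumably reflects details of the original procedure that are not stated here.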

Electron transfer from iron to aluminum in this alloy also decreases the
strain that would otherwise result from the insertion of the large aluminum
atoms, 13% larger than the iron atoms, into the iron lattice. After electron
transfer the aluminum atoms are only 3.7% larger than the iron atoms.
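The arithmetic behind the expected bond lengths and the size mismatch can be reproduced as follows. This is a sketch under stated assumptions: each contact is taken as the mean of the two effective diameters, and the average runs over the Al-Fe(A) and Fe(A)-Fe(B) contacts, which are equally numerous in this structure. The diameters are those quoted in the text; the function names are illustrative.

```python
# Effective diameters (bond lengths in the body-centered structure), angstroms,
# as quoted in the text.
D_FE = 2.482         # alpha iron, also used for Fe(B)
D_AL = 2.792         # aluminum without electron transfer
D_AL_MINUS = 2.578   # aluminum with one added electron (taken as silicon)
D_FE_A_PLUS = 2.486  # Fe(A) after losing half an electron


def mean_bond_length(d_al, d_fe_a, d_fe_b):
    """Average of the Al-Fe(A) and Fe(A)-Fe(B) contact distances."""
    al_fe_a = 0.5 * (d_al + d_fe_a)
    fe_a_fe_b = 0.5 * (d_fe_a + d_fe_b)
    return 0.5 * (al_fe_a + fe_a_fe_b)


def percent_larger(d_big, d_small):
    """Relative size excess of one atom over another, in percent."""
    return 100.0 * (d_big / d_small - 1.0)


NO_TRANSFER = mean_bond_length(D_AL, D_FE, D_FE)
WITH_TRANSFER = mean_bond_length(D_AL_MINUS, D_FE_A_PLUS, D_FE)
MISMATCH_BEFORE = percent_larger(D_AL, D_FE)
MISMATCH_AFTER = percent_larger(D_AL_MINUS, D_FE_A_PLUS)
```

Without transfer the average reduces to the additivity value of about 2.56 Å; with transfer it comes to 2.508 Å, and the size mismatch falls from roughly 12 to 13% to about 3.7%.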

The observed saturation magnetic moment per iron atom in alpha iron is
2.22 Bohr magnetons. This is, according to the Zener theory of induced
magnetic polarization of conduction electrons (9), to be divided into the part
2.00 attributed to the atomic electrons of the iron atom and the part 0.22
attributed to the induced moment of the conduction electrons (10). The
atoms Fe$^B$ in AlFe$_3$ would be expected to have the same atomic moment,
2.00, and the atoms Fe$^A$, which have lost half an electron, would be expected
to have a value 0.5 less, 1.5 Bohr magnetons, in each case increased by the
induced moment of the conduction electrons, which is about 0.14 (the value
for alpha iron multiplied by the ratio of the average atomic moments for
AlFe$_3$ and alpha iron). Accordingly, the conclusion is reached that the magnetic
moments should be about 2.14 for Fe$^B$, 1.64 for Fe$^A$, and 0.14 for Al
(or probably slightly larger for Fe$^B$ and smaller for Al), in agreement with
the observed values, 2.18 for Fe$^B$, 1.50 for Fe$^A$, and zero for Al, all ±0.10.
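The predicted moments can be retraced step by step. The sketch below assumes that the "average atomic moment" of AlFe$_3$ is taken over all four atoms of the formula unit (two Fe(A), one Fe(B), one Al), which is the reading that reproduces the quoted induced moment of about 0.14; the variable names are illustrative.

```python
# Values for alpha iron, in Bohr magnetons, as quoted in the text.
ATOMIC_MOMENT_FE = 2.00   # part attributed to the atomic electrons
INDUCED_MOMENT_FE = 0.22  # part attributed to the conduction electrons

# Atomic moments expected in AlFe3 and the number of atoms per formula unit.
ATOMIC = {"FeA": ATOMIC_MOMENT_FE - 0.5, "FeB": ATOMIC_MOMENT_FE, "Al": 0.0}
COUNT = {"FeA": 2, "FeB": 1, "Al": 1}

# Average atomic moment per atom (assumed to run over all four atoms).
AVERAGE_ATOMIC = sum(ATOMIC[s] * COUNT[s] for s in ATOMIC) / sum(COUNT.values())

# Induced moment scaled by the ratio of average atomic moments.
INDUCED = INDUCED_MOMENT_FE * AVERAGE_ATOMIC / ATOMIC_MOMENT_FE

PREDICTED = {species: moment + INDUCED for species, moment in ATOMIC.items()}
OBSERVED = {"FeA": 1.50, "FeB": 2.18, "Al": 0.0}  # each quoted as +/- 0.10

for species in PREDICTED:
    print(f"{species}: predicted {PREDICTED[species]:.2f}, observed {OBSERVED[species]:.2f}")
```

The induced moment comes out slightly below 0.14, and the three predicted values agree with those given in the text to two decimals.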

We can in the same way predict values for the magnetic moments of
atoms in other alloys. For example, in the ordered alloy AlCo$_3$, with the same
structure as AlFe$_3$, we predict for Co$^B$ the value 1.72 (the same as for the
element) and for Co$^A$ the larger value 2.25 Bohr magnetons. The
resonating-valence-bond theory of metals and alloys leads clearly to the prediction that a
cobalt atom on losing half an electron would undergo an increase in its
magnetic moment, rather than a decrease, as found for the iron atom.

A neutron diffraction study has been reported (11,12) for the magnetic
form factors of the iron atoms in AlFe$_3$. A difference has been found between
Fe$^A$ and Fe$^B$. In the theoretical discussion of this difference, it would, I think,
be wise to take into consideration the transfer of electrons from the Fe$^A$
atoms to aluminum atoms, as discussed above.


* The observed value of D(1) for the Si-Si bond is 2.353 Å. Change to the value for the
body-centered structure is made as described in the preceding footnote.


REFERENCES 

1. L. Pauling, Phys. Rev. 54, 899 (1938). 

2. L. Pauling, J. Am. Chem. Soc. 69, 542 (1947). 

3. L. Pauling, Proc. Roy. Soc. (London) A196, 343 (1949).

4. L. Pauling, “The Nature of the Chemical Bond,” 3rd ed., Chap. 11. Cornell Univ.
Press, Ithaca, New York, 1960.

5. A. J. Bradley and A. H. Jay, Proc. Roy. Soc. (London) A136, 201 (1932).

6. R. Nathans, M. T. Pigott, and C. G. Shull, Proc. Conf. Magnetism Magnetic Mat.,
Boston, 1956 p. 242. Am. Inst. Electrical Engineers, New York, 1957.

7. L. Pauling, Proc. Natl. Acad. Sci. U.S. 36, 533 (1950).

8. L. Pauling, J. Chem. Soc. (London) 1948, 1461.

9. C. Zener, Phys. Rev. 81, 440 (1951).

10. L. Pauling, Proc. Natl. Acad. Sci. U.S. 39, 551 (1953).

11. S. J. Pickart and R. Nathans, Phys. Rev. 123, 1163 (1961).

12. C. G. Shull, in “Electronic Structure and Alloy Chemistry of the Transition Elements,”
p. 69. Wiley (Interscience), New York, 1963.


The One-Electron Approximation 

Epistemological, Spectroscopic, and Chemical Comments 


CHR. KLIXBÜLL JØRGENSEN

CYANAMID EUROPEAN RESEARCH INSTITUTE, COLOGNY (GENEVA), SWITZERLAND 


I. The Point of View of Atomic Spectroscopy

II. MO Theory of Molecules and Inorganic Chromophores

III. The Unexplained Success of One-Particle Classifications

References


I. The Point of View of Atomic Spectroscopy 

Slater (1) wrote the paper on many-electron systems on which the treatise
by Condon and Shortley (2) is based. Slater proposed that parameters of
interelectronic repulsion express the main effects of separation into different
SL-multiplet terms of configurations containing at least one partly filled
shell. It had previously been realized (3) how J and parity are the exact quantum
numbers of energy levels in spherical symmetry, and that S and L are also
fairly good quantum numbers if Russell and Saunders’ coupling scheme is a
good approximation. However, in the minds of many physicists the origins
of the energy differences were some rather mysterious spin-spin and orbit-orbit
coupling forces. Slater’s theory suggested that the only two-electron
operator of practical importance is the Coulombic repulsion $e^2/r_{12}$ and that,
to a first approximation, the different energy levels of a given configuration
have the same electronic density in our three-dimensional space, but different
values of the average reciprocal interelectronic distance $\langle 1/r_{12} \rangle$. The ground
term of a given configuration (usually, according to Hund’s rule, the maximum
L compatible with the maximum S) has the lowest value of this expression.

Charlotte Moore-Sitterly’s “Atomic Energy Levels” clearly demonstrates
a great success in the application of these ideas to the classification of the
first ten or first hundred energy levels of a given gaseous atom or monatomic
ion. Frequently, all the levels expected of complicated configurations have
been identified, and there is hardly any case known where “superfluous”
low-lying energy levels resist this classification.

At the time when numerous atomic spectroscopists lived (and one may
deplore that artificial radioactivity and nuclear transmutation depopulated
this class nearly entirely) it was the general feeling that intermixing of
configurations is only important when two configurations of the same parity
overlap or at least are separated by a distance smaller than the width of the
individual configurations, the nondiagonal elements of the interelectronic
repulsion normally being smaller than the energy differences produced by
differences of the diagonal elements in a given configuration. A typical case of
coinciding configurations is [Ar]3d$^q$4s$^2$ and [Ar]3d$^{q+1}$4s in neutral atoms of
the first transition group (and the analogous configurations in the 4d and 5d
groups), whereas the ions M$^{2+}$ or M$^{3+}$ have [Ar]3d$^q$ well below [Ar]3d$^{q-1}$4s.
In such cases, it was previously assumed that the well-separated configurations
do not interact appreciably.

In a certain sense, this was an unduly optimistic attitude. The ground state
$^1S$ of the helium atom has a wave function (4) which, in the squared amplitudes,
consists of 99.19% of the conventional configuration 1s$^2$. The two next-largest
contributions, 0.38% (∞s)$^2$ and 0.40% (∞p)$^2$, come from orbitals
essentially belonging to the continuum. Though their diagonal energy above
the ground state is some one to three times the ionization energy, and hence
proportional to $Z_*^2$, $Z_*$ being some effective charge of the type (Z − 0.31),
the nondiagonal elements are proportional to $Z_*$, and the second-order
perturbation energy is hence essentially invariant as a function of $Z_*$. Thus, one
can at most hope for wave functions of many-electron systems having the
conventional configuration $C = a_1^2 a_2^2 a_3^2 \cdots a_n^2$ (containing 2n electrons) to
exhibit the approximate form

$$\Psi = (1 - \cdots)^{1/2}\,\Psi(C) + \sum_{k}\sum_{r} A_{kr}\,\Psi(C\,a_k^{-2}a_r^{2}) + \cdots \qquad (1)$$

where only two-electron substitutions $a_k^2 \to a_r^2$ have been considered. The
effects of one-electron substitutions are known nearly to vanish for self-consistent
orbitals $\phi_k$. The ground state $^1S$ of the beryllium atom (5) agrees
qualitatively with Eq. (1), the important substitutions of 1s$^2$2s$^2$ being 1s$^2$2p$^2$,
(∞s)$^2$2s$^2$, and (∞p)$^2$2s$^2$. The first of these substituted configurations involves
orbitals with negative energy known from discrete, excited states, and the
second-order perturbation argument suggests a stabilization of the ground
state proportional to $Z_*$ rather than being invariant in an isoelectronic series.
This situation is called near-degeneracy of orbitals.
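The two scaling arguments can be illustrated with the usual second-order estimate, $-|V|^2/\Delta E$. The sketch below is schematic: the unit prefactors are arbitrary, and the linear dependence of the energy gap on the effective charge in the near-degenerate case is an assumption made here to reproduce the stated proportionality, not a formula taken from the text.

```python
def second_order_energy(coupling, gap):
    """Second-order perturbation estimate -|V|^2 / (energy gap)."""
    return -coupling ** 2 / gap


def continuum_case(z_eff, v=1.0, e=1.0):
    """Coupling proportional to Z*, gap proportional to Z*^2."""
    return second_order_energy(v * z_eff, e * z_eff ** 2)


def near_degenerate_case(z_eff, v=1.0, e=1.0):
    """Coupling proportional to Z*, gap assumed proportional to Z*."""
    return second_order_energy(v * z_eff, e * z_eff)


for z_eff in (2.0, 4.0, 8.0):
    print(z_eff, continuum_case(z_eff), near_degenerate_case(z_eff))
```

In the first case the stabilization does not change with the effective charge; in the second it grows linearly along an isoelectronic series.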

The theory of many-electron atoms was refined by Racah for d-shells (6)
and f-shells (7) using sophisticated group theoretical techniques (8) and in a
more modest way analyzing the known energy expressions (9, 10). Thus,
one can show that the baricenter of all states having a definite value of S
belonging to a given configuration containing one partly filled shell $l^q$ has the
interelectronic repulsion energy

the average value and the spin-pairing energy parameter D having the
expressions in terms of Slater’s and Racah’s parameters:

There does not exist a general parameter dependent on L, but it is striking (9)
how frequently the multiplet terms have energies being linear functions of
L(L + 1).

The deviations of the observed levels from the predictions (considering the
$F^k$ integrals as parameters to be determined empirically) can frequently
be ascribed to effects of near-degeneracy. Racah and Trees introduced $l$
correction terms besides the $(l + 1)$ different $F^k$ integrals; a closer analysis
(11) shows that effectively, the energies of the $(2l + 1)$ multiplet terms of
$l^2$ ($^1S$, $^3P$, $^1D$, $^3F$, ...) are the intrinsic variables of such a description, which is
an application of fractional parentage building $l^q$ wave functions from those
of $l^2$. Quite unexpectedly, even the configurations [Ne]3s$^2$3p$^4$3d$^{q+2}$ have a
perceptible influence of the near-degeneracy type on the conventional
configuration [Ne]3s$^2$3p$^6$3d$^q$ of first transition-group ions (12). It would be
possible (10) to extend the concept of seniority number $v$ to the idea of
uncoupling of $l$-values. Terms with $v < q$ have wave functions which could be
constructed by adding spherically symmetric contributions Ψ to a lower
number of electrons with positive $l$. Hence, the terms with decreased $v$ allow
the mixing of $l^{q-2}s^2$ and $l^q$. The similar condition of admixture of $l^{q-2}p^2$ and
$l^q$ would be the separability of $^1S$, $^3P$, or $^1D$ components for two of the
electrons, furnishing a mechanism of slight stabilization of these three terms but
not $^3F$ and $^1G$ of d$^2$, one of the goals of the Racah-Trees corrections. On the
other hand, the uncoupling of $l^q$ to $l^{q-2}(l')^2$ with $l' > l$ disturbs all term
distances, but we do not yet know whether there would be a systematic
dependence on S and L. Racah (13) seems to prefer “model interactions” of mutual
dipolar polarization etc. of the orbitals, producing perturbation energies
proportional to the Racah-Trees corrections.

Watson (14,15) calculated analytical Hartree-Fock functions for 3d group
ions and Freeman and Watson (16) for 4f group ions. The deviations of the
integrals of interelectronic repulsion $F^k$ calculated for the 3d group ions with
ionic charge z relative to the empirical values of these parameters show a
number of striking regularities (9): to the first approximation, they are all too
large by a factor (z + 3)/(z + 2). Such a behavior of an isoelectronic series
indicates predominant effects of continuum orbitals because the $F^k$ integrals
themselves are roughly proportional to (z + 2), i.e., all term distances are
decreased in a way not much dependent on z in an isoelectronic series. However,
these decreases are dependent on S and L (corresponding to the Racah-Trees
corrections) and $^2P$ of 3d$^3$ has lower energy than $^2H$ though the two
terms are degenerate in Slater’s theory.
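The step from "too large by a factor (z + 3)/(z + 2)" to "decreased in a way not much dependent on z" is a one-line calculation, sketched here. The proportionality constant `c` and the function names are hypothetical; only the two proportionalities are taken from the text.

```python
def empirical_fk(z, c=1.0):
    """Empirical F^k, taken as roughly proportional to (z + 2)."""
    return c * (z + 2)


def hartree_fock_fk(z, c=1.0):
    """Hartree-Fock F^k: the empirical value times (z + 3)/(z + 2)."""
    return empirical_fk(z, c) * (z + 3) / (z + 2)


def overestimate(z, c=1.0):
    """Absolute excess of the calculated over the empirical parameter."""
    return hartree_fock_fk(z, c) - empirical_fk(z, c)


for z in (2, 3, 4):
    ratio = hartree_fock_fk(z) / empirical_fk(z)
    print(f"z = {z}: ratio {ratio:.3f}, excess {overestimate(z):.3f}")
```

The relative error falls as z increases, while the absolute excess stays equal to the constant `c` for every ionic charge.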

Another interesting result is that term distances such as $^3F \to {}^3P$ or $^4F \to {}^4P$
between terms having the maximum value of S are decreased to the same
extent as the other energy differences. Some authors thought that electrons
with parallel spin would produce much weaker correlation effects than
electrons with opposite spin because the antisymmetrization conditions imposed
on Ψ prevent very large values of $1/r_{12}$ in the former case. However, this
argument is doubtful because of the long-range nature of the Coulomb
interactions (10) and has been refuted quantitatively for the helium atom,
where the correlation effects have a spatial extent comparable to the average
diameter of the electronic cloud (17).

We must admit at present that we have no certain idea about the actual
form of Eq. (1) for transition-group ions, though configurations such as
3d$^{q-2}$(∞f)$^2$ and 3d$^{q-2}$(∞g)$^2$ must be responsible for a large part of the
difference between Hartree-Fock and empirical $F^k$ parameters. A conservative
estimate would make Eq. (1) divergent for at least one hundred electrons
because four- and six-electron substitutions then are more important than
two-electron substitutions. However, this divergence may very well occur at a
much lower number of electrons. On the other hand, the classification of
energy levels according to electron configurations is not less successful for
radium and actinium than for lighter atoms.

It may be quite legitimate to ask whether the conventional Ψ is an entirely
satisfactory description of many-electron systems. A theorem first shown by
Dirac has recently been discussed by Löwdin (4): if only two-electron
interactions occur, the second-order density matrix with two spin and six spatial
variables is sufficient for obtaining all observable quantities. There is much
evidence that the Hartree-Fock Ψ gives very good agreement with the first-order
density matrix in our three-dimensional space. After all, the concord
with experimental data is the supreme criterion for quantum mechanics, and
the results of x-ray diffraction by electronic density or neutron diffraction by
uncompensated spin density are compatible with the calculations though,
unfortunately, the experimental precision which can be obtained is not very
high. On the other hand, the second-order density matrix of the Hartree-Fock
Ψ exaggerates the average value $\langle 1/r_{12} \rangle$. In other words, there is an
“internal polarization” in the second-order density matrix without great
effect on the first-order electronic density. This is the physical significance of
the two-electron substitutions with continuum orbitals having a radial
extension comparable to the filled orbitals. In molecular calculations, Julg
(18) and other authors have proposed, for empirical reasons, to decrease all
parameters of interelectronic repulsion roughly to the same extent as observed
in isolated atoms. This is also the basis for Moffitt’s ideas (19) about “atoms
in molecules.” If the $Z_*$-dependent effects of near-degeneracy of orbitals are
neglected, it seems to be a surprisingly good approximation (20) to assume
that the inter-n-shell interelectronic repulsion $F^k(nl, n'l')$ is not affected,
that $F^0(nl, nl)$ (or perhaps rather $A_*$ of Eq. (1)) and $F^0(nl, nl')$ are decreased
from their Hartree-Fock value as if the effective charge $Z_*$ were 0.0747 unit
lower, whereas the parameters separating the terms of a given configuration
are decreased as if $Z_*$ were a whole unit smaller. This corresponds to two
different “effective dielectric constants” for the internal polarization but
has not yet found any solid theoretical justification. However, the regularities
observed suggest that a correlated second-order density matrix is a more
appropriate description than configuration interaction.

II. MO Theory of Molecules and Inorganic Chromophores 

Slater (21) wrote a highly fascinating technical report which has only been
published in part (22,23; Chapter 4). In particular, the analysis of H$_2$ at
varying internuclear distance shows that V.B. treatments meet intrinsic
difficulties of a rather unexpected nature. If the participating atomic orbitals
are properly orthogonalized, no chemical bonding can be predicted by the
diagonal element of one structure only. On the other hand, it had been known
since Vol. 1 of The Journal of Chemical Physics in 1933 that the MO
configuration $\sigma_g^2$ is an appropriate description of the ground state $^1\Sigma_g^+$ of the
hydrogen molecule only if the internuclear distance is not considerably
longer than at the minimum of the potential curve. The exact Ψ at this
minimum can be expanded (24) in configurations of natural spin orbitals, the
squares of the amplitude being closely similar to the three most important
contributions to the He ground state, viz. 98.22% $\sigma_g^2$, 0.99% $\sigma_u^2$, 0.42%
$\pi_u^2$, and 0.30% of a second $\sigma_g^2$. It is instructive to compare with the behavior
of two isolated hydrogen atoms, i.e., a hydrogen molecule stretched to a very
large internuclear distance, whose Ψ is an exact mixture of 50% $\sigma_g^2$ and 50%
$\sigma_u^2$. The diagonal elements of energy of the two latter configurations are
considerably higher than for two isolated H atoms. In other words, H$_2$ is a
typical example of how MO theory, using qualitative arguments related
to well-defined configurations, is applicable only in situations of relevant
symmetry (25). The symmetry group of the nuclei in a molecule is not necessarily
the best to use in MO theory; the nondiagonal elements of the two-electron
operator are frequently much larger than the one-electron energy differences,
if the orbitals are adapted to irrelevant symmetry components. This point
clarifies many problems recently discussed in the literature, and removes the most
serious criticism one can make of molecular orbital and energy band theory.

Ligand field theory (26-28) is essentially an application of MO theory to
the relevant symmetry of inorganic chromophores (29) MX$_N$ where the
central atom M connected to N ligand atoms X contains a partly filled d or f
shell. The influence of the ligand atoms cannot be represented by an external
electrostatic field (except the group-theoretical properties (30)) but is
represented in the angular overlap model (31-33) in a way somewhat comparable to
the Hückel model for organic molecules, but utilizing the properties of the
hydrogenic angular $l$-functions of the central atom. The diagonal elements of
these highly heteronuclear molecules are more difficult to evaluate; it is
imperative to take the Madelung energy into account (10,34).

For our purpose, the important point of view is that the absorption spectra
frequently allow the determination of the preponderant configuration (35)
in the sense of classifying the low-lying energy levels of the chromophore.
Thus, there is no doubt that the complicated distribution of absorption bands
of numerous V(II), Cr(III), Mn(IV), Mo(III), Tc(IV), Re(IV), and Ir(VI)
complexes (36) containing octahedral chromophores MX$_6$ indicates the
preponderant configuration d$^3$. It is actually the main reason why we can assign
oxidation states written with Roman numerals, vanadium(II), chromium(III),
etc., because the fractional charge on the central atom is certainly not as
high as +2, +3, etc. There exist ligands, such as NO and certain macrocyclic
molecules, which do not allow a preponderant configuration with an integral
number of electrons in the partly filled shell to be defined (37). However,
most common ligands are “innocent,” i.e., they allow the preponderant
configuration to be detected.

Lanthanide compounds are in a completely extreme category. The energy 
levels fall in narrow groups each corresponding to a definite 7-level of 4f q 
in spherical symmetry (8,10,38,39). The interelectronic repulsion parameters 
which are some 30% smaller for the gaseous ions than calculated from the 
Hartree-Fock functions (16) are some 1-5% smaller in compounds than in 
gaseous ions (40-42) (the evidence for this statement was somewhat indirect 
until Sugar (43) recently found twelve of the thirteen 7-levels of [Xe]4f 2 in 
gaseous Pr 3+ ). In metallic alloys or in semiconductors with low energy gap, 
the magnetic moments (44) and other physical properties indicate that the 
number of electrons in the partly filled 4f shell normally is an integer. The 
metallicity seems frequently to be connected with a calculated lower energy 
(45) of the lowest term of 4f^(q−1)5d than of 4f^q. Professor K. A. Jensen,
Copenhagen, has proposed to put oxidation states determined in this way
from preponderant configurations in brackets, 4f⁷ for instance corresponding
to Eu[II], Gd[III], or Tb[IV], in order to distinguish this concept from the
normal oxidation numbers. The physical origin of the integral values of q
normally found may reside in the coefficient q(q − 1)/2 of A_* in Eq. (2). The
partly filled 4f shell has an unusually small average radius, and hence A_*
is larger than in the four other transition groups, also contributing to the
nearly invariant oxidation state M(III) of the 4f group in contrast to the 5f
group (39). If the 4f electrons were highly delocalized, the classical coefficient
q²/2 would apply to A_*. The difference, −qA_*/2, explains the tendency
towards localization of a definite number of electrons in partly filled shells with
large A_*.
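
The counting behind this argument can be written out explicitly. The following is only a sketch: it assumes that the interelectronic repulsion parameter of Eq. (2), written $A_*$ here, multiplies the number of distinct electron pairs in the localized case and the classical self-repulsion $q^2/2$ of a smeared-out charge in the delocalized case; all other terms of Eq. (2) are left out.

```latex
\begin{align*}
  \text{localized shell:}\quad   & \binom{q}{2} A_* = \frac{q(q-1)}{2}\,A_* , \\
  \text{delocalized charge:}\quad & \frac{q^{2}}{2}\,A_* , \\
  \text{difference:}\quad        & \frac{q(q-1)}{2}\,A_* - \frac{q^{2}}{2}\,A_* = -\frac{q}{2}\,A_* .
\end{align*}
% Example: q = 7 gives 21 pairs against 24.5, i.e. a lowering of 3.5 A_*.
```

The lowering grows linearly with q and with $A_*$, which is why shells with small average radius favor an integral number of localized electrons.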

Bloch’s theory of energy bands is the logical extension of MO theory to 
infinite crystals. However, it has to be looked upon with great circumspection 
because of the disastrous effects of nondiagonal elements of the two-electron 
operator strongly mixing configurations adapted to irrelevant symmetry 
components (25) as we saw above in the case of two hydrogen atoms at large 
distance. The partly filled energy bands of normal lanthanide compounds, 
such as PrCl₃ or Nd₂O₃, certainly do not produce metallic properties. The
excuse in the energy-band jargon is “hardly any curvature of the energy bands
and vanishing mobility.” One may not be convinced that this statement
indicates the profound reasons for nonmetallicity. The central atoms have a
much lower electron affinity than ionization energy, differing approximately
to the extent of A_*, about 18 eV, and consequently, it requires a great deal of
energy to transport an electron by the “hopping process” in such a lattice 
(46). Many 3d group compounds, such as NiO, are in the same situation. For 
group-theoretical reasons, the paramagnetic form of this compound, which 
has cubic symmetry, ought to have a half-filled energy band though it is 
nonmetallic in the same way as the lanthanide salts discussed above.
It is also impossible that the antiferromagnetic form that exists below the
Néel temperature has the upper energy bands of opposite spin separated
sufficiently to explain its behavior on a one-electron basis. Thus, the absorption
spectra of NiO and the isomorphous diluted crystals Ni_xMg_(1−x)O are all
essentially similar (47), corresponding to the octahedral chromophore Ni(II)O₆,
and the x-ray absorption corresponding to 2p⁶3d⁸ → 2p⁵3d⁹ in NiO shows
only the fine-structure expected from the two- and one-electron parameters 
present in Slater’s theory for the isolated central atom Ni(II) (48). 

It has frequently been suggested (44,49) that metallic bonding occurs only 
for internuclear distances below a sharply defined threshold. In certain cases, 
such as Nd_xSm_(1−x)Se, the limit as a function of composition x seems very
narrow (50). It must not be ignored that the periodicity of ideal lattices is a
mathematically convenient assumption in the energy-band theory rather than
an absolute condition for metallicity. After all, liquid metals do exist. In the
nonmetallic systems, chromophores can be recognized with remarkable 
frequency. Thus, the absorption spectra of glasses (51) and molten salts (52) 
containing transition-group ions are very similar to analogous solutions and 
crystalline solids. 

In the sense of determining preponderant configurations, there is no doubt 
about the frequent, though not universal, individuality of chromophores 
MX n . The possibility of an individuality for atoms M and X in compounds 
is a much more intricate question. Slater (53) discussed Bragg’s old idea that 
atoms approximately have a constant radius independent of the nature of the 
chemical bonding. It is indeed true that the observable distance M-X frequently
is divided with a 0.7 Å larger radius of M and a 0.7 Å smaller radius
of X when assuming “covalent radii” than when assuming “ionic radii.” As Slater
himself admits, the main difficulty is that a typically ionic compound such as CsF
has a shorter Cs-Cs distance than metallic Cs, but shows no physical
consequences of intermetallic bonding. There is an absolute sense in which 
atoms of metallic elements contract when forming fairly electrovalent 
compounds. It is not a sufficient argument for VO being an interstitial compound
that the V-V distance is roughly the same as in vanadium metal.

X-ray diffraction results agree that the inner shells of atoms in compounds 
fill only a tiny proportion of the total volume and have radial distributions 
compatible with Hartree-Fock calculations. The low electronic density in 
the space between the atomic cores is the main subject for chemical discussions.
At present, the most prominent evidence for the expansion and
delocalization of the partly filled shells in transition-group compounds is the 
nephelauxetic effect (54,55) that the parameters of interelectronic repulsion 
are decreased in compounds relative to the corresponding gaseous ions 
M^(+z). This evidence would be somewhat ambiguous, because of the discrepancy
between Hartree-Fock and empirical F^k integrals, if it were not for
the extremely regular variation in the d groups as a function of the central
atom and of the ligands. In general, oxidizing central atoms (such as Fe(III),
Co(III), and all M(IV)) and reducing ligands (such as Br⁻, I⁻, and S²⁻) show
a much more pronounced nephelauxetic effect than the opposite extreme
represented by MnF₂ and KMnF₃ (both containing the chromophore
Mn(II)F₆) and Mn(H₂O)₆²⁺. This agrees with the qualitative expectation
of MO theory; that is, the delocalization of the (antibonding) central atom
orbitals is largest when the difference between the diagonal elements of
energy H_M and H_X is smallest.

III. The Unexplained Success of One-Particle Classifications 

Perhaps nowhere is the astonishing expedience of one-particle models 
as spectacular as in the nuclear-shell quantum numbers. It is absolutely 
unacceptable to use Hartree and Fock’s arguments in nuclei, and yet, the
classification of the lowest energy levels works. 

The next-most-surprising case is the inorganic chromophores MX_N. Here,
the configurations of molecular orbitals are adapted to the relevant symmetry,
classifying the lower energy levels in a striking way. It is worth emphasizing
the evidence for the excitation or ionization of inner shells, which produce
apparently discrete (though auto-ionizing) levels at far higher energy than
the first ionization energy. Thus, the absorption spectra in the far ultraviolet
region of gaseous Cs atoms show excitations of the 5p shell (56); the 3d shell
of Kr (57) and 4d shell of Xe (57,58); and the 3s shell of Ar and 4s shell of Kr
(59). There is no longer a sharp experimental distinction between the far
ultraviolet and the soft x-ray regions. In crystalline substances, such excitations
can also be observed, e.g., of 2p and 2s of fluorine and 1s of lithium in LiF
(60).

A valuable new technique for the accurate measurement of the energy of electrons
ejected by monochromatic far ultraviolet radiation (photoelectron
spectroscopy) has allowed the ionization energies of the penultimate orbitals 
of many molecules to be determined (61,62). The agreement with Mulliken’s 
MO classification is very good. The ionization energies of inner shells can be 
shown (Chap. 12, (10)) to vary for a given element roughly to the same 
extent as the energy of the loosest bound electrons. This result was confirmed 
by Watson’s calculations for different ionic charges (14) and by recent, very 
accurate, measurements (63) on compounds of light elements. 

Actually, the definition of one-electron energies is by no means a trivial 
task (10,64). If one desires to include large amounts of two-electron quantities 
in the otherwise far-too-negative values, one really has the choice among 
three sensible possibilities: to take all interelectronic repulsion between noncore
electrons explicitly into account; to consider differences between baricenters
of configurations (where the one-electron energies no longer are
exactly additive); or to correct for effects on shells with low average radii
having particularly large A_* parameters (39).

It is instructive to remark that atomic spectroscopists in the period from 
1925 to 1930 classified the J-levels observed into configurations with little
appeal to quantum mechanics for approval. In the author’s opinion, molecules 
and inorganic chromophores were in a similar situation from 1955 to 1965. 
The MO classifications, adapted to the relevant symmetry, work, whether 
or not they ought to. The field of quantum mechanics is in an unpleasant 
position at two interesting points: the manifest existence of quasi-discrete 
states in the continuum, apparently belonging to inner-shell excitations; and 
the fact that because the correlation energy of all neutral atoms heavier than 
neon is larger than the first ionization energy, the backbone is taken out of the 
variation principle. 
It is not known how the field of quantum mechanics will look in five years,
or in 500 years. It is the author’s belief that atomic spectroscopists classifying
energy levels, or chemists inducing them from experimental facts, should
not worry more about the probable revolutions in physics to come than pilots
of airplanes or users of vacuum cleaners normally consider relativistic
or quantum deviations from Newton’s mechanics. The successful classification
is here to stay. Finally, it may be worth mentioning that the approximate
transferability of Koopmans’ type of one-electron quantities from one configuration
to another is perhaps the most profound property of the one-electron
approximation. We should not concentrate only on the description of a
single state, but also consider all the low-lying energy levels under the same
angle in a sort of super-Hilbert-space. The meaning of the “preponderant
configuration” of a state may be slightly more general than Ψ(C) in Eq. (1);
actually, its coefficient may be nearly as small in many-electron systems as it is
in nuclei. It might be preferable to apply B. Russell’s theory of types (cf. a
short discussion in a recent paper on a classification of categorical propositions
(65)) and say that the collection of preponderant configurations is a higher type
property of the manifold of all the low-lying energy levels of a given system.
I. Introduction

The aim of this paper is twofold. First, we present a general, compact and 
fairly rigorous theory of the linear magnetic response of a quantum mechanical 
system, using the formalism of the thermodynamic free energy functional. 
Secondly, we propose a new, unequivocal and gauge invariant definition of the 
diamagnetic part of the linear magnetic response. 

For illustration consider an atom (without relativistic and spin effects) in a 
homogeneous magnetic field B = (0, 0, B_z). If we fix the origin of the coordinate
system, say to the center of the atom, we still have many different
possibilities of gauging the vector potential A, defined by B = curl A. If we
choose the vector potential as

$$A_x = -B_z y/2, \qquad A_y = B_z x/2, \qquad A_z = 0$$

* This work has been supported by the Swiss National Foundation for the Advancement 
of Science. 
we get the usual result for the susceptibility χ

$$\chi = -\frac{e^2}{4mc^2}\,\langle 0|\sum_k (x_k^2 + y_k^2)|0\rangle. \tag{1}$$

Taking the Landau gauge

$$A_x = A_z = 0, \qquad A_y = B_z x,$$

straightforward perturbation theory gives

$$\chi = -\frac{e^2}{mc^2}\,\langle 0|\sum_k x_k^2|0\rangle + \frac{2e^2}{m^2c^2}\sum_{s\neq 0}(E_s - E_0)^{-1}\Bigl|\langle 0|\sum_k x_k p_{ky}|s\rangle\Bigr|^2, \tag{2}$$

where in the second term the summation is over all excited states, including 
the continuum. Both expressions are identical of course [the explicit verification
can be found in the lucid paper of Bloch (1961), from which this example
is taken], but it is disturbing to find a simple gauge transformation converting a
plain problem into a complicated one. In this example, as well as in many others,
the appearance of excited states is a mere mathematical artifact. In the case of
an atom a symmetry argument shows that the second choice of the gauge is
not adapted to the problem. In a molecule without symmetry, however, a
good choice of the gauge is not at all evident. As in Eq. (2), the usual expression
for the susceptibility of a general molecule consists of an inherently
negative term, involving the ground state wave function only, and of an inherently
positive term, involving all excited states, including the continuum.
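
The two gauges of this example can be compared numerically. The sketch below is illustrative only (the function names and the sample point are assumptions, not taken from the text): it checks by central differences that both vector potentials give the same field (0, 0, B_z) and that they differ by the gradient of the gauge function B_z xy/2.

```python
# Sketch: symmetric gauge vs. Landau gauge for a uniform field (0, 0, bz).
# All names are illustrative; they do not come from the paper.

def a_symmetric(x, y, z, bz):
    """A = (-bz*y/2, bz*x/2, 0), the gauge leading to Eq. (1)."""
    return (-bz * y / 2.0, bz * x / 2.0, 0.0)

def a_landau(x, y, z, bz):
    """A = (0, bz*x, 0), the Landau gauge leading to Eq. (2)."""
    return (0.0, bz * x, 0.0)

def gauge_function(x, y, z, bz):
    """chi = bz*x*y/2, whose gradient should equal a_landau - a_symmetric."""
    return bz * x * y / 2.0

def curl(a, x, y, z, bz, h=1e-4):
    """Central-difference curl of the vector field a at the point (x, y, z)."""
    def d(comp, axis):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += h
        minus[axis] -= h
        return (a(*plus, bz)[comp] - a(*minus, bz)[comp]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def grad(f, x, y, z, bz, h=1e-4):
    """Central-difference gradient of the scalar function f at (x, y, z)."""
    out = []
    for axis in range(3):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += h
        minus[axis] -= h
        out.append((f(*plus, bz) - f(*minus, bz)) / (2.0 * h))
    return tuple(out)

def gauge_difference(x, y, z, bz):
    """Component-wise a_landau - a_symmetric at the point (x, y, z)."""
    al, asym = a_landau(x, y, z, bz), a_symmetric(x, y, z, bz)
    return tuple(p - q for p, q in zip(al, asym))

if __name__ == "__main__":
    point, bz = (0.3, -1.2, 0.7), 2.0
    print("curl A (symmetric):", curl(a_symmetric, *point, bz))
    print("curl A (Landau):   ", curl(a_landau, *point, bz))
    print("A_L - A_S:         ", gauge_difference(*point, bz))
    print("grad chi:          ", grad(gauge_function, *point, bz))
```

Both potentials describe the same physical field, so any difference between Eqs. (1) and (2) in the size of the individual terms is a property of the gauge, not of the atom.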

In theoretical discussions of the susceptibility of molecules (Van Vleck,
1932, p. 275ff.) and of the magnetic shielding of nuclei in molecules (for a
review compare, e.g., Abragam, 1961, p. 175ff.) it has become usual to call the
ground state term the “diamagnetic” part and the term involving the excited 
states the “paramagnetic” part. This partition is not gauge invariant and 
makes therefore no sense unless the gauge chosen is clearly specified. As our 
example shows, a specification of the origin of the coordinate system is not 
sufficient. In Section IV we give a unique procedure to determine an optimal 
gauge, in the sense that the difficult term involving the excited states will be 
as small as possible. Then the response kernel can be separated in a gauge 
invariant manner into two kernels giving this optimal partition, and we 
propose to call these kernels “diamagnetic” and “paramagnetic.” 
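
The gauge dependence of the conventional partition can be made concrete with Eqs. (1) and (2). The following worked case is a sketch that assumes a spherically symmetric ground state, so that the expectation values of $\sum_k x_k^2$ and $\sum_k y_k^2$ each equal one third of that of $\sum_k r_k^2$; the abbreviation $R^2$ is introduced here for illustration only.

```latex
\begin{align*}
  R^2 &\equiv \langle 0|\textstyle\sum_k r_k^2|0\rangle, \qquad
  \langle 0|\textstyle\sum_k x_k^2|0\rangle = \langle 0|\textstyle\sum_k y_k^2|0\rangle = \tfrac{1}{3}R^2, \\
  \text{Eq. (1):}\quad \chi &= -\frac{e^2}{4mc^2}\cdot\frac{2}{3}R^2 = -\frac{e^2}{6mc^2}\,R^2, \\
  \text{Eq. (2), ground-state term:}\quad & -\frac{e^2}{mc^2}\cdot\frac{1}{3}R^2 = -\frac{e^2}{3mc^2}\,R^2 = 2\chi, \\
  \text{Eq. (2), excited-state term:}\quad & \chi - 2\chi = +\frac{e^2}{6mc^2}\,R^2 .
\end{align*}
```

In the Landau gauge the ground-state term is thus twice the total susceptibility, and the excited-state term exists only to cancel half of it, although the origin of the coordinate system is the same in both gauges.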

We intend to present a careful analysis of the magnetic response of molecules,
making no pretence of strict mathematical rigor. The recent investigations
on Dirac’s δ-function as a linear functional (for a review compare, e.g.,
Gel’fand and Schilow, 1960, 1964; Gel’fand and Vilenkin, 1964) and on
the validity of the expansion postulate of quantum mechanics (compare Jauch
and Misra, 1965; Hellwig, 1964; Marlow, 1965) show that with a great
deal of effort the Dirac formalism can be given a meaningful mathematical
interpretation.

For our purpose full rigor is hardly worth the trouble, and we use the Dirac 
formalism without further apology. In order to have a compact notation we
often do not distinguish between discrete and continuous variables, summation
and integration, Kronecker δ and Dirac δ. The perturbation of the spectrum
of the Hamiltonian by an external field leads to delicate questions; the mathematical
problems involved are far from trivial (compare, e.g., Friedrichs,
1965). We circumvent these difficulties by the physically sound assumption
of the existence of a linear continuous response functional; no assumption 
about the nature of the spectrum of the perturbed Hamiltonian will be needed. 

Notation 

Vectors are printed in Roman boldface. We use a rectangular coordinate 
system and the summation convention, Greek indices running from 1 to 3. 
Space coordinates are designated by r = (r_1, r_2, r_3); for integrations we use
the notation d³r = dr_1 dr_2 dr_3; all integrals extend over the whole
three-dimensional space. Quantum mechanical operators are printed in capital
German, superoperators in lowercase German. 

II. Model Assumptions 

A. The thermodynamic approach 

The theory of the response of a system to an external disturbance* is often 
based on the adiabatic perturbation theory which, from the mathematical 
point of view, is exposed to severe criticism. In the case of an equilibrium 
system with a time independent Hamiltonian, however, the thermodynamic 
perturbation theory (reviewed by Nakajima, 1955) is both physically and 
mathematically more satisfactory. Most investigations of the magnetic 
properties of free molecules refer to a pure state and are hampered by the 
difficulties a rigorous discussion of nodes of the wave function and possible 
degeneracies involves (compare, e.g., the discussion by McLachlan and Baker, 
1961). The concept of a “ pure state ” is an idealization never actually realized. 
Using the canonical ensemble may be more realistic and even simpler; the 
ground state properties are then given by the limit of zero temperature. The 
troubles caused by the singular nature of pure states are avoided by working 
out the theory for T ≠ 0, postponing the zero temperature limit to the final
result. Our main mathematical tool will be a temperature-dependent scalar 

* There is an extensive literature on response theories. Besides the fundamental paper 
of Martin and Schwinger (1959) we should like to select for quotation the paper of Konstantinov
and Perel’ (1960) and the recent summary by Kubo (1965).
product for operators, discussed in the appendix. This scalar product becomes
singular for T = 0 (i.e., for T = 0 the relation $\langle\mathfrak{A}|\mathfrak{A}\rangle = 0$ may hold even for
$\mathfrak{A} \neq 0$); therefore this limit has to be discussed carefully.
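
The singular behavior at T = 0 can be illustrated on a toy model. The sketch below is not taken from the paper: it assumes a finite-dimensional Hamiltonian that is diagonal with the given energies, and evaluates the scalar product of Eqs. (34) and (35) in that eigenbasis, where the Kubo integral over λ can be done in closed form. The function names and the two-level example are illustrative assumptions.

```python
import math

def kubo_weight(e_a, e_b, beta):
    """exp(-beta*e_a) * integral_0^beta exp(lam*(e_a - e_b)) dlam, in closed form."""
    d = e_a - e_b
    if abs(d) < 1e-12:
        # Degenerate pair: the integrand is constant, so the integral is beta.
        return beta * math.exp(-beta * e_a)
    return (math.exp(-beta * e_b) - math.exp(-beta * e_a)) / d

def kubo_norm(energies, x, beta):
    """<X|X>_0 for H_0 = diag(energies); x[b][a] is the matrix element <b|X|a>."""
    z = sum(math.exp(-beta * e) for e in energies)
    total = 0.0
    for a, e_a in enumerate(energies):
        for b, e_b in enumerate(energies):
            total += abs(x[b][a]) ** 2 * kubo_weight(e_a, e_b, beta)
    return total / z

if __name__ == "__main__":
    levels = [0.0, 1.0]                      # ground and excited level (arbitrary units)
    projector = [[0.0, 0.0], [0.0, 1.0]]     # nonzero operator living on the excited level
    transition = [[0.0, 1.0], [1.0, 0.0]]    # operator connecting the two levels
    for beta in (1.0, 10.0, 50.0):
        print(beta, kubo_norm(levels, projector, beta), kubo_norm(levels, transition, beta))
```

Every weight is positive at finite β, so the norm of a nonzero operator is positive. For β → ∞, however, the norm of the projector on the excited level tends to zero although the operator itself does not vanish, which is the kind of degeneracy the zero temperature limit has to be protected against.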

We restrict our discussion to finite systems and use a statistical ensemble 
containing a fixed number of molecules. Most of the arguments can be extended
to include condensed phases, but our main interest lies in the magnetic
properties of free molecules. 

B. The operator of the current density 

A molecular Lagrangian depends on external fields linearly only, even if 
relativistic corrections and spin interactions are included. Therefore the 
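Lagrangian $\mathfrak{L}[\mathbf{A}]$ of a system under the influence of an external, time independent
magnetic field B(r) = curl A(r) is given by

$$\mathfrak{L}[\mathbf{A}] = \mathfrak{L}_0 + \frac{1}{c}\int d^3r\, \mathfrak{J}_\nu(\mathbf{r})\, A_\nu(\mathbf{r}), \tag{3}$$

where $\mathfrak{J}_\nu(\mathbf{r})$ is the current density operator of the system. The Lagrangian
$\mathfrak{L}(q, \dot q, \mathbf{A})$ is related to the Hamiltonian $\mathfrak{H}(q, p, \mathbf{A})$ by a Legendre transformation,
hence

$$\left\{\frac{\delta \mathfrak{L}}{\delta A_\nu(\mathbf{r})}\right\}_{q = \mathrm{const},\ \dot q = \mathrm{const}} = -\left\{\frac{\delta \mathfrak{H}}{\delta A_\nu(\mathbf{r})}\right\}_{q = \mathrm{const},\ p = \mathrm{const}}.$$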

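In spite of the fact that the Hamiltonian may have a complicated dependence
on the external field, the correct current density operator can be obtained
by a variational derivative of the Hamiltonian,

$$\mathfrak{J}_\nu(\mathbf{r}) = -c\,\delta\mathfrak{H}[\mathbf{A}]/\delta A_\nu(\mathbf{r}), \tag{4}$$

where partial derivatives here and in the following are always taken with respect to the
canonical variables (i.e., q = const and p = const). In the canonical representation,
the current density operator may contain the external field in high
orders; for the linear response the first order only is needed,

$$\mathfrak{J}_\nu(\mathbf{r}) = \mathfrak{J}_{0\nu}(\mathbf{r}) + \int d^3r'\, \mathfrak{J}_{1\nu\mu}(\mathbf{r}\mathbf{r}')\,A_\mu(\mathbf{r}') + \cdots, \tag{5}$$

where

$$\mathfrak{J}_{0\nu}(\mathbf{r}) = \{\mathfrak{J}_\nu(\mathbf{r})\}_{\mathbf{A}=0}, \qquad \mathfrak{J}_{1\nu\mu}(\mathbf{r}\mathbf{r}') = \{\delta\mathfrak{J}_\nu(\mathbf{r})/\delta A_\mu(\mathbf{r}')\}_{\mathbf{A}=0}. \tag{6}$$

Hence the Hamiltonian may be written in the following form

$$\mathfrak{H} = \mathfrak{H}_0 - \frac{1}{c}\int d^3r\, \mathfrak{J}_{0\nu}(\mathbf{r})\,A_\nu(\mathbf{r}) - \frac{1}{2c}\int d^3r\int d^3r'\, \mathfrak{J}_{1\nu\mu}(\mathbf{r}\mathbf{r}')\,A_\nu(\mathbf{r})\,A_\mu(\mathbf{r}') + \cdots \tag{7}$$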
C. Specification of a Born-Oppenheimer Hamiltonian


The discussion in Section IV uses an operator scalar product which is
positive definite only if the Hamiltonian $\mathfrak{H}_0$ has a complete set of eigenfunctions;
therefore $\mathfrak{H}_0$ should be self-adjoint. To the best of our knowledge this
problem is not solved for a general molecule. As a result of the pioneering
work of Kato (Kato, 1951; Ikebe and Kato, 1962; Žislin, 1960) it is known that
a Born-Oppenheimer Hamiltonian (fixed nuclei) of N electrons with Coulomb
interactions only is essentially self-adjoint, and has an infinite number of
bound states with energies below the dissociation energy. If we choose as
unperturbed Hamiltonian $\mathfrak{H}_0 = \mathfrak{H}[\mathbf{A} = 0]$ the Born-Oppenheimer Hamiltonian
without spin interactions and without relativistic terms, i.e., the
operator

$$\mathfrak{H}_0 = \sum_{n=1}^{N}\frac{\mathbf{p}_n^2}{2m} - \sum_{n=1}^{N}\sum_{K} e_0^2 Z_K\,|\mathbf{R}_K - \mathbf{q}_n|^{-1} + \sum_{n<m} e_0^2\,|\mathbf{q}_n - \mathbf{q}_m|^{-1}, \tag{8}$$

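we are therefore on safe ground. From the corresponding Hamiltonian in the
presence of an external magnetic field B = curl A, the current density operator
can be deduced according to Eqs. (4) to (7). The well-known result is

$$\mathfrak{J}_0(\mathbf{r}) = -\frac{e_0}{2m}\sum_{n=1}^{N}\{\mathbf{p}_n\,\delta(\mathbf{r}-\mathbf{q}_n) + \delta(\mathbf{r}-\mathbf{q}_n)\,\mathbf{p}_n\} + c\sum_{n=1}^{N}\operatorname{curl}\{\gamma\,\mathbf{S}_n\,\delta(\mathbf{r}-\mathbf{q}_n)\}, \tag{9}$$

$$\mathfrak{J}_{1\nu\mu}(\mathbf{r}\mathbf{r}') = -\frac{e_0^2}{mc}\,\delta_{\nu\mu}\,\delta(\mathbf{r}-\mathbf{r}')\,\mathfrak{N}(\mathbf{r}), \tag{10}$$

with the number density operator $\mathfrak{N}$

$$\mathfrak{N}(\mathbf{r}) = \sum_{n=1}^{N}\delta(\mathbf{r}-\mathbf{q}_n). \tag{11}$$

The vector r is an arbitrary parameter vector; $\mathbf{q}_n$, $\mathbf{p}_n$, and $\mathbf{S}_n$ are the position
operator, the momentum operator, and the spin operator of the nth electron.
$e_0$ represents the elementary charge, $e_0 > 0$; the charge of the electron is
$-e_0$ and γ its gyromagnetic ratio.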


III. Linear Magnetic Response

A. The induced current density 

We consider a specimen of substance in thermal equilibrium under the 
action of a magnetic field B(r),* which we assume to be time independent, 

* Today there is no question that E and B are the basic field vectors, the fields D and H 
describing the influence of matter (compare, e.g., Born and Wolf, 1959, p. 1). Consequently
(but in contrast to the historical tradition) we refer to E and B (and not to E and H) as to 
the electric and magnetic field vectors. 
defined throughout all space, square integrable in the sense of Lebesgue,
and vanishing at infinity at least as a magnetic dipole, |B(r)| = O(r⁻³).*
Under the action of the magnetic field B(r) a field of magnetization M(r) is
established. The magnetization M(r) is an experimentally measurable
quantity (e.g., by magnetic resonance experiments) and is given by the fundamental
relation (compare, e.g., Phillips, 1962, p. 27)

$$\mathbf{M}(\mathbf{r}) = -\frac{\delta F}{\delta \mathbf{B}(\mathbf{r})}. \tag{12}$$

It is convenient to replace the solenoidal vectors B and M by the vector
potential A and the current density J, respectively. The relations

$$\mathbf{B}(\mathbf{r}) = \operatorname{curl}\mathbf{A}(\mathbf{r}), \tag{13}$$

$$\mathbf{A}(\mathbf{r}) = \mathbf{A}^T(\mathbf{r}) + \operatorname{grad}\chi(\mathbf{r}), \qquad \operatorname{div}\mathbf{A}^T = 0, \tag{14}$$

$$\mathbf{J}(\mathbf{r}) = c\,\operatorname{curl}\mathbf{M}(\mathbf{r}), \qquad \operatorname{div}\mathbf{J} = 0, \tag{15}$$

define J(r) and the transversal vector potential $\mathbf{A}^T$ uniquely, but the longitudinal
part grad χ is arbitrary. In the following, however, grad χ is assumed to
be single valued, time independent, and square integrable in the sense of
Lebesgue; the function χ is then called a gauge function.

In consequence of Eqs. (12) and (15) the induced current density is given
rigorously by a variational derivative of the free energy,

$$\mathbf{J}(\mathbf{r}) = -c\,\frac{\delta F}{\delta \mathbf{A}(\mathbf{r})}. \tag{16}$$

According to Bloch’s theorem (compare, e.g., the discussion by Schafroth,
1960, p. 405) and due to the time-reversal invariance, the current density of
any system in thermal equilibrium and not subject to external magnetic
fields (internal fields are permitted, however) vanishes everywhere, i.e.,

$$\{\mathbf{J}(\mathbf{r})\}_{\mathbf{A}=0} = 0. \tag{17}$$


Excluding phase transitions, it is physically evident that the induced current 
density as an experimentally measurable quantity has to be a continuous 
functional of the external field. Therefore the linear part J L of the response J 
is a continuous linear functional of the vector potential A, warranting by the 
Riesz representation theorem (compare, e.g., Neumark, 1959) the existence 


* Differentiability is not necessary, the vector operations grad, div, and curl are used in 
the sense of Weyl (1940) or Muller (1957). The usual assumption of a magnetic field
homogeneous over all space is unphysical, leads to severe mathematical difficulties, and
should therefore not be used.
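of an integrable kernel $F_{\nu\mu}(\mathbf{r}\mathbf{r}')$. The linear response $\mathbf{J}^L$ is therefore given by

$$J^L_\nu(\mathbf{r}) = -c\int d^3r'\, F_{\nu\mu}(\mathbf{r}\mathbf{r}')\,A_\mu(\mathbf{r}'), \tag{18}$$

where

$$F_{\nu\mu}(\mathbf{r}\mathbf{r}') = \left\{\frac{\delta^2 F[\mathbf{A}]}{\delta A_\nu(\mathbf{r})\,\delta A_\mu(\mathbf{r}')}\right\}_{\mathbf{A}=0}. \tag{19}$$

The kernel $F_{\nu\mu}(\mathbf{r}\mathbf{r}')$ is real and symmetric,

$$F_{\nu\mu}(\mathbf{r}\mathbf{r}') = \{F_{\nu\mu}(\mathbf{r}\mathbf{r}')\}^*, \qquad F_{\nu\mu}(\mathbf{r}\mathbf{r}') = F_{\mu\nu}(\mathbf{r}'\mathbf{r}). \tag{20}$$

It goes without saying that the free energy is gauge-invariant,
i.e., the relation

$$F[\mathbf{A} + \operatorname{grad}\chi] = F[\mathbf{A}] \tag{21}$$

holds for any gauge function χ. Taking the variational derivatives with respect
to the function χ, we get the important relations

$$\operatorname{div}\mathbf{J}^L = 0, \tag{22}$$

$$\partial F_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r_\nu = \partial F_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r'_\mu = 0. \tag{23}$$
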
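
The step from the gauge invariance (21) to the relations (22) and (23) can be sketched as follows. The derivation assumes that boundary terms vanish, as implied by the integrability conditions imposed on the fields; the parameter ε is introduced here for illustration only.

```latex
\begin{align*}
  0 &= \frac{d}{d\varepsilon}\,F[\mathbf{A} + \varepsilon\,\operatorname{grad}\chi]\Big|_{\varepsilon=0}
     = \int d^3r\,\frac{\delta F}{\delta A_\nu(\mathbf{r})}\,\frac{\partial\chi}{\partial r_\nu}
     = -\frac{1}{c}\int d^3r\,J_\nu(\mathbf{r})\,\frac{\partial\chi}{\partial r_\nu}
     = \frac{1}{c}\int d^3r\,\chi(\mathbf{r})\,\operatorname{div}\mathbf{J}(\mathbf{r}), \\
  &\Rightarrow\ \operatorname{div}\mathbf{J} = 0 \ \text{for arbitrary } \chi,
     \ \text{and in linear order}\ \operatorname{div}\mathbf{J}^L = 0 . \\
  0 &= \frac{\partial J^L_\nu(\mathbf{r})}{\partial r_\nu}
     = -c\int d^3r'\,\frac{\partial F_{\nu\mu}(\mathbf{r}\mathbf{r}')}{\partial r_\nu}\,A_\mu(\mathbf{r}')
     \ \text{for all } \mathbf{A}
     \ \Rightarrow\ \frac{\partial F_{\nu\mu}(\mathbf{r}\mathbf{r}')}{\partial r_\nu} = 0 .
\end{align*}
```

The second relation in (23) then follows from the symmetry (20) of the kernel.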

B. The kernel of the linear magnetic response 

If $\mathfrak{A}$ is some observable, the quantum-statistical expectation value $\langle\mathfrak{A}\rangle$ of
$\mathfrak{A}$ with respect to the canonical ensemble of a system with the temperature
T (β = 1/kT) is given by

$$\langle\mathfrak{A}\rangle = (1/Z)\,\mathrm{Tr}\{\mathfrak{A}\exp(-\beta\mathfrak{H})\}, \tag{24}$$

$$Z = \mathrm{Tr}\{\exp(-\beta\mathfrak{H})\}, \tag{25}$$

where the trace runs over a complete set of states with appropriate permutation
symmetry. From a knowledge† of the partition functional or, equivalently,
the functional of the free energy F,

$$-\beta F = \ln Z, \tag{26}$$


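all expectation values or response kernels can be evaluated. Since no current
flows in thermal equilibrium in the absence of a magnetic field [Eq. (17)],
the response kernel $F_{\nu\mu}(\mathbf{r}\mathbf{r}')$ [Eq. (19)] can be reduced to a second derivative
of the partition functional Z,

$$F_{\nu\mu}(\mathbf{r}\mathbf{r}') = -\frac{1}{\beta}\left\{\frac{1}{Z}\,\frac{\delta^2 Z}{\delta A_\nu(\mathbf{r})\,\delta A_\mu(\mathbf{r}')}\right\}_{\mathbf{A}=0}. \tag{27}$$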
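
The reduction to Eq. (27) can be written out in two lines. This is a sketch that uses only Eqs. (16), (17), (19), and (26); primes on $A_\mu$ indicate the argument $\mathbf{r}'$.

```latex
\begin{align*}
  \frac{\delta F}{\delta A_\nu}
    &= -\frac{1}{\beta Z}\,\frac{\delta Z}{\delta A_\nu}
    \quad\Longrightarrow\quad
    \frac{\delta Z}{\delta A_\nu} = \frac{\beta Z}{c}\,J_\nu , \\
  \frac{\delta^2 F}{\delta A_\nu\,\delta A'_\mu}
    &= -\frac{1}{\beta}\left[\frac{1}{Z}\,\frac{\delta^2 Z}{\delta A_\nu\,\delta A'_\mu}
       - \frac{1}{Z^2}\,\frac{\delta Z}{\delta A_\nu}\,\frac{\delta Z}{\delta A'_\mu}\right].
\end{align*}
% At A = 0 the first derivatives of Z are proportional to J and vanish by Eq. (17),
% so only the first term in the bracket survives, which is Eq. (27).
```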


† The partition function and related quantities are most conveniently and systematically
evaluated in the formalism of the temperature dependent 1-particle Green function (compare
the excellent introductions by Bonch-Bruevich and Tyablikov, 1962; Kadanoff and Baym,
1962; Abrikosov et al., 1963).
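Using the fact that the trace of a product of operators is invariant under
cyclic permutations, the second functional derivative of the partition functional
can be written as*

$$\frac{\delta^2 Z}{\delta A_\nu(\mathbf{r})\,\delta A_\mu(\mathbf{r}')} = -\beta\,\mathrm{Tr}\left\{e^{-\beta\mathfrak{H}}\,\frac{\delta^2\mathfrak{H}}{\delta A_\nu(\mathbf{r})\,\delta A_\mu(\mathbf{r}')}\right\} - \beta\,\mathrm{Tr}\left\{\frac{\delta\mathfrak{H}}{\delta A_\mu(\mathbf{r}')}\,\frac{\delta}{\delta A_\nu(\mathbf{r})}\,e^{-\beta\mathfrak{H}}\right\}. \tag{28}$$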


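The second term in Eq. (28) is best evaluated with the aid of the following
operator identity:

$$\frac{\delta}{\delta A_\nu(\mathbf{r})}\exp(-\beta\mathfrak{H}[\mathbf{A}]) = -\exp(-\beta\mathfrak{H})\;\mathfrak{l}\left\{\frac{\delta\mathfrak{H}}{\delta A_\nu(\mathbf{r})}\right\}, \tag{29}$$

where $\mathfrak{l}$ is the superoperator† of the Kubo transform, defined by

$$\mathfrak{l}\{\mathfrak{X}\} = \int_0^\beta d\lambda\,\exp(\lambda\mathfrak{H})\,\mathfrak{X}\,\exp(-\lambda\mathfrak{H}) \quad \text{for any operator } \mathfrak{X}. \tag{30}$$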

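With this relation and using Eqs. (4) and (24), Eq. (28) can be written in the
form

$$\frac{1}{Z}\,\frac{\delta^2 Z}{\delta A_\nu(\mathbf{r})\,\delta A_\mu(\mathbf{r}')} = \frac{\beta}{c}\,\langle\delta\mathfrak{J}_\nu(\mathbf{r})/\delta A_\mu(\mathbf{r}')\rangle + \frac{\beta}{c^2}\,\langle\mathfrak{l}\{\mathfrak{J}_\nu(\mathbf{r})\}\,\mathfrak{J}_\mu(\mathbf{r}')\rangle. \tag{31}$$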


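The kernel Eq. (27) is the limit A = 0 of this expression; using Eq. (6) the
kernel of the linear magnetic response is therefore given by

$$F_{\nu\mu}(\mathbf{r}\mathbf{r}') = -\frac{1}{c}\,\langle\mathfrak{J}_{1\nu\mu}(\mathbf{r}\mathbf{r}')\rangle_0 - \frac{1}{c^2}\,\langle\mathfrak{l}_0\{\mathfrak{J}_{0\nu}(\mathbf{r})\}\,\mathfrak{J}_{0\mu}(\mathbf{r}')\rangle_0, \tag{32}$$

whereby all quantities refer to the unperturbed system. The following
abbreviations are used

$$\langle\mathfrak{X}\rangle_0 = \mathrm{Tr}\{\mathfrak{X}\exp(-\beta\mathfrak{H}_0)\}/\mathrm{Tr}\{\exp(-\beta\mathfrak{H}_0)\}, \tag{33}$$

$$\mathfrak{l}_0\{\mathfrak{X}\} = \int_0^\beta d\lambda\,\exp(\lambda\mathfrak{H}_0)\,\mathfrak{X}\,\exp(-\lambda\mathfrak{H}_0). \tag{34}$$
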

* In this step we also assumed the functional derivative to be interchangeable with the 
trace, but we are not aware of a rigorous proof (the situation improved by the recent work 
by Langerholc, 1965). Experimentally measurable quantities are rigorously related to 
Eq. (27); if this relation should not be equivalent to Eq. (32), then the current density J 
defined by Eq. (18) would not be given by the quantum statistical expectation value of the 
current density operator. 

† A superoperator is an operator which transforms an operator on the Hilbert space of
quantum mechanical state vectors into a new operator on this Hilbert space. 
In the appendix it is shown that the expression

$$\langle\mathfrak{l}_0\{\mathfrak{X}^+\}\,\mathfrak{Y}\rangle_0 = \langle\mathfrak{X}|\mathfrak{Y}\rangle_0 \tag{35}$$

has all properties of a scalar product. The final result for the linear magnetic 
response may therefore be written in the form 



The kernel $K^{(1)}$ is just an expectation value of a simple operator with respect to
the unperturbed ensemble, and therefore relatively easy to evaluate. In the
metric chosen the kernel $K^{(2)}$ is the current-current autocorrelation tensor;
the positive definite character of the scalar product [Eq. (35)] implies

$$\int d^3r\int d^3r'\, V_\nu(\mathbf{r})\,K^{(2)}_{\nu\mu}(\mathbf{r}\mathbf{r}')\,V_\mu(\mathbf{r}') > 0$$

for any admissible vector V ≠ 0, therefore $K^{(2)}$ is positive definite, $K^{(2)} > 0$.
The complete evaluation of $K^{(2)}$ is much more complicated than that of $K^{(1)}$,
but these two kernels are not independent. The gauge relation (23) gives the
interrelation

$$\partial K^{(2)}_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r_\nu = -\,\partial K^{(1)}_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r_\nu, \tag{37}$$

$$\partial K^{(2)}_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r'_\mu = -\,\partial K^{(1)}_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r'_\mu. \tag{38}$$
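
For orientation, the form of Eq. (36) can be cross-checked against Eqs. (10), (18), (32), and (35). The following is a sketch, assuming the normalization $cJ^L_\nu = \int K_{\nu\mu}A_\mu\,d^3r'$ that is implied by Eqs. (52) and (54); the prefactors and layout are an assumption.

```latex
\begin{align*}
  c\,J^L_\nu(\mathbf{r}) &= \int d^3r'\,\bigl\{K^{(1)}_{\nu\mu}(\mathbf{r}\mathbf{r}')
      + K^{(2)}_{\nu\mu}(\mathbf{r}\mathbf{r}')\bigr\}\,A_\mu(\mathbf{r}'),
      \qquad K_{\nu\mu} = -c^2 F_{\nu\mu}, \\
  K^{(1)}_{\nu\mu}(\mathbf{r}\mathbf{r}') &= c\,\langle\mathfrak{J}_{1\nu\mu}(\mathbf{r}\mathbf{r}')\rangle_0
      = -\frac{e_0^2}{m}\,\delta_{\nu\mu}\,\delta(\mathbf{r}-\mathbf{r}')\,\langle\mathfrak{N}(\mathbf{r})\rangle_0, \\
  K^{(2)}_{\nu\mu}(\mathbf{r}\mathbf{r}') &= \langle\mathfrak{J}_{0\nu}(\mathbf{r})\,|\,\mathfrak{J}_{0\mu}(\mathbf{r}')\rangle_0 .
\end{align*}
```

With this identification the gauge relation (23) for $F_{\nu\mu}$ translates directly into Eqs. (37) and (38).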

IV. The Diamagnetism vs Paramagnetism 

A. The optimal gauge for a free molecule 

In this section we restrict our considerations to free molecules. We therefore
suppose that we have a system of noninteracting molecules in thermal
equilibrium at a fixed temperature T ≠ 0. The Hamiltonian of the whole
system then separates into a sum of molecular Hamiltonians; in the following
the terms “free energy,” “partition function,” and “Hamiltonian” refer to a
single molecule. As molecular Hamiltonian we choose the simple
Born-Oppenheimer operator Eqs. (7) to (11).

The free energy per molecule, according to Section III, is given by

$$F = F_0 - (1/c^2)\{F^{(1)} + F^{(2)}\} + O(A^3), \tag{39}$$
where $F_0$ is the free energy of the molecule without an external field and

$$F^{(i)} = \int d^3r\int d^3r'\, A_\nu(\mathbf{r})\,K^{(i)}_{\nu\mu}(\mathbf{r}\mathbf{r}')\,A_\mu(\mathbf{r}') \qquad (i = 1, 2). \tag{40}$$

Neither $F^{(1)}$ nor $F^{(2)}$ is gauge-invariant but their sum depends on the transversal
vector potential only. According to Eqs. (10) and (36), $F^{(1)}$ is given by

$$F^{(1)} = -\frac{e_0^2}{m}\int d^3r\,\rho_0(\mathbf{r})\,A^2(\mathbf{r}), \tag{41}$$

where $\rho_0(\mathbf{r})$ is the expectation value of the number density with respect to the
unperturbed ensemble,

$$\rho_0(\mathbf{r}) = \langle\mathfrak{N}(\mathbf{r})\rangle_0. \tag{42}$$

Hence $F^{(1)}$ is negative, while the positive definite correlation kernel $K^{(2)}$
implies a positive $F^{(2)}$,

$$F^{(1)} < 0, \qquad F^{(2)} > 0. \tag{43}$$

Because of the difficulties involved in the practical evaluation of the kernel
$K^{(2)}$ it is useful to choose a gauge minimizing $F^{(2)}$. Maximizing the functional
$F^{(1)}$ with respect to the gauge function χ for a given $\mathbf{A}^T$ yields the necessary
condition

$$\operatorname{div}\{\rho_0(\mathbf{r})\operatorname{grad}\chi(\mathbf{r})\} + \mathbf{A}^T(\mathbf{r})\cdot\operatorname{grad}\rho_0(\mathbf{r}) = 0. \tag{44}$$

This elliptic differential equation has to be solved for χ (compare Section
IV,B). There is a unique solution that we call the optimal gauge function* for
the transversal vector potential $\mathbf{A}^T$. In this gauge $F^{(1)}$ is maximal and given by

$$F^{(1)}_{\mathrm{opt}} = -\frac{e_0^2}{m}\int d^3r\,\rho_0(\mathbf{r})\,\{[\mathbf{A}^T(\mathbf{r})]^2 - [\operatorname{grad}\chi_{\mathrm{opt}}(\mathbf{r})]^2\}. \tag{45}$$
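
The condition (44) and the value (45) follow from a short variational calculation. The sketch below inserts $\mathbf{A} = \mathbf{A}^T + \operatorname{grad}\chi$ into Eq. (41), uses $\operatorname{div}\mathbf{A}^T = 0$, and assumes that surface terms vanish at infinity.

```latex
\begin{align*}
  F^{(1)}[\chi] &= -\frac{e_0^2}{m}\int d^3r\,\rho_0\,(\mathbf{A}^T + \operatorname{grad}\chi)^2, \\
  \delta F^{(1)} &= -\frac{2e_0^2}{m}\int d^3r\,\rho_0\,(\mathbf{A}^T + \operatorname{grad}\chi)\cdot\operatorname{grad}\delta\chi
     = \frac{2e_0^2}{m}\int d^3r\,\delta\chi\,\operatorname{div}\{\rho_0(\mathbf{A}^T + \operatorname{grad}\chi)\}, \\
  \delta F^{(1)} = 0 \ &\Rightarrow\ \operatorname{div}\{\rho_0\operatorname{grad}\chi\}
     + \mathbf{A}^T\cdot\operatorname{grad}\rho_0 = 0 . \\
  \text{With } \delta\chi = \chi_{\mathrm{opt}}:\quad
  &\int d^3r\,\rho_0\,\mathbf{A}^T\cdot\operatorname{grad}\chi_{\mathrm{opt}}
     = -\int d^3r\,\rho_0\,(\operatorname{grad}\chi_{\mathrm{opt}})^2, \\
  F^{(1)}_{\mathrm{opt}} &= -\frac{e_0^2}{m}\int d^3r\,\rho_0\,
     \{(\mathbf{A}^T)^2 - (\operatorname{grad}\chi_{\mathrm{opt}})^2\}.
\end{align*}
% The second variation, -(2 e_0^2/m) \int \rho_0 (grad \delta\chi)^2, is non-positive,
% so the stationary point is a maximum of F^{(1)}.
```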


B. A gauge invariant partition of the magnetic response

It is useful to state the result of the preceding section in a gauge invariant
manner. Equation (44) has to be solved in an infinite region for the optimal
gauge function $\chi = \chi_{\mathrm{opt}}$. To this end we consider the corresponding eigenvalue
problem,

$$\operatorname{div}\{\rho_0(\mathbf{r})\operatorname{grad}\phi_n(\mathbf{r})\} + a_n\,\phi_n(\mathbf{r}) = 0. \tag{46}$$

According to the theory of generalized eigenfunctions (Gel’fand and Schilow,
1964, p. 170) and to a theorem of L. Schwartz (compare, e.g., Bers and

* Similar-looking equations and the problem of an optimal gauge have been discussed by 
Stephen (1957), Rebane (1960), McLachlan and Baker (1961), Guy et al. (1961), and others. 
Their equations were not elliptic and the existence of solutions could not be proved;
probably they do not have regular solutions.
Schechter, 1964, p. 139) an elliptic operator over an infinite region has a
complete set of square integrable eigenfunctions, provided $\rho_0(\mathbf{r})$ is infinitely
differentiable. Furthermore all eigenvalues $a_n$ are positive, which can be proved
as follows. From Eqs. (10), (36), and (38) the following kernel representation
of the operator div $\rho_0$ grad is obtained

$$\int d^3r'\, D(\mathbf{r}\mathbf{r}')\,f(\mathbf{r}') = -\frac{e_0^2}{m}\operatorname{div}\{\rho_0(\mathbf{r})\operatorname{grad}f(\mathbf{r})\}, \tag{47}$$

with 

or with Eq. (37), 

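$$D(\mathbf{r}\mathbf{r}') = \langle\operatorname{div}\mathfrak{J}_0(\mathbf{r})\,|\,\operatorname{div}\mathfrak{J}_0(\mathbf{r}')\rangle_0. \tag{49}$$

In the appendix it is shown that $\langle\mathfrak{X}|\mathfrak{X}\rangle_0 > 0$ for $\mathfrak{X} \neq 0$, hence the kernel
D(rr') (in the space of square integrable functions) is positive definite
and all eigenvalues of D are positive.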

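Therefore an optimal gauge function $\chi_{\mathrm{opt}}$ exists, is uniquely determined,
and given by

$$\chi_{\mathrm{opt}}(\mathbf{r}) = \sum_n a_n^{-1}\,\phi_n(\mathbf{r})\int d^3r'\,\phi_n(\mathbf{r}')\,\mathbf{A}^T(\mathbf{r}')\cdot\operatorname{grad}\rho_0(\mathbf{r}'), \tag{50}$$

where the eigenfunctions are taken orthonormalized,

$$\int d^3r\,\phi_n(\mathbf{r})\,\phi_m(\mathbf{r}) = \delta_{nm}. \tag{51}$$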

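In an arbitrary gauge $\chi$ the contribution of the kernel $K^{(1)}$ to the
current density is given by

$$cJ^{(1)}_\nu(\mathbf{r}) = \int d^3r'\,K^{(1)}_{\nu\mu}(\mathbf{r}\mathbf{r}')\{A^T_\mu(\mathbf{r}') + \partial\chi(\mathbf{r}')/\partial r'_\mu\}. \tag{52}$$

In the optimal gauge $\chi = \chi_{\mathrm{opt}}$, $\mathbf{J}^{(1)} = \mathbf{J}^{\mathrm{opt}}$,
Eq. (52) can be written as

$$cJ^{\mathrm{opt}}_\nu(\mathbf{r}) = \int d^3r'\,K^D_{\nu\mu}(\mathbf{r}\mathbf{r}')A^T_\mu(\mathbf{r}'), \tag{53}$$

where, according to Eq. (50), the new kernel $K^D$ is given by

$$K^D_{\nu\mu}(\mathbf{r}\mathbf{r}') = -(e_0^2/m)\rho_0(\mathbf{r})\delta_{\nu\mu}\delta(\mathbf{r}-\mathbf{r}') + (e_0^2/m)\rho_0(\mathbf{r})\rho_0(\mathbf{r}')\sum_n \frac{1}{a_n}\frac{\partial\phi_n(\mathbf{r})}{\partial r_\nu}\frac{\partial\phi_n(\mathbf{r}')}{\partial r'_\mu}. \tag{54}$$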


In general $\mathbf{J}^{(1)}$ defined by Eq. (52) is not solenoidal. It is
therefore remarkable that $\mathbf{J}^{\mathrm{opt}}$ is a true current, i.e.,

$$\operatorname{div}\mathbf{J}^{\mathrm{opt}} = 0, \tag{55}$$

as a simple calculation using Eq. (46) and the completeness relation for the
functions $\phi_n$ shows. Equation (55) is identical with the relations

$$\partial K^D_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r_\nu = \partial K^D_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r'_\mu = 0, \tag{56}$$

the second following from the symmetry of the kernel $K^D$. The validity of the
gauge relations Eq. (56) permits a calculation of the current
$\mathbf{J}^{\mathrm{opt}}$ in a gauge invariant manner, i.e., the current
density $\mathbf{J}^D$ defined by

$$cJ^D_\nu(\mathbf{r}) = \int d^3r'\,K^D_{\nu\mu}(\mathbf{r}\mathbf{r}')A_\mu(\mathbf{r}') \tag{57}$$

may be calculated in any gauge of the vector potential and is equal to
$\mathbf{J}^{\mathrm{opt}}$.
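
The "simple calculation" behind Eq. (55) can be sketched as follows. This is only an outline under two assumptions taken from the surrounding text rather than derived here: the transverse potential satisfies div A^T = 0, and the optimal-gauge current is proportional to rho_0 (A^T + grad chi_opt), the form quoted in Eq. (63).

```latex
\begin{align*}
\operatorname{div}\{\rho_0\operatorname{grad}\chi_{\mathrm{opt}}\}(\mathbf r)
  &= \sum_n \frac{1}{a_n}\,
     \operatorname{div}\{\rho_0\operatorname{grad}\phi_n\}(\mathbf r)
     \int d^3r'\,\phi_n(\mathbf r')\,
     \mathbf A^T(\mathbf r')\cdot\operatorname{grad}\rho_0(\mathbf r')
     && \text{[Eq. (50)]} \\
  &= -\sum_n \phi_n(\mathbf r)
     \int d^3r'\,\phi_n(\mathbf r')\,
     \mathbf A^T(\mathbf r')\cdot\operatorname{grad}\rho_0(\mathbf r')
     && \text{[Eq. (46)]} \\
  &= -\mathbf A^T(\mathbf r)\cdot\operatorname{grad}\rho_0(\mathbf r)
     && \text{[completeness of the } \phi_n] \\
  &= -\operatorname{div}\{\rho_0(\mathbf r)\,\mathbf A^T(\mathbf r)\}
     && [\operatorname{div}\mathbf A^T = 0,\ \text{assumed}]
\end{align*}
% Hence div{rho_0 (A^T + grad chi_opt)} = 0, and with
% J^opt proportional to rho_0 (A^T + grad chi_opt) the current is solenoidal:
\[
\operatorname{div}\mathbf J^{\mathrm{opt}} = 0 .
\]
```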

C. Conclusions 

The conventional splitting of the response kernel $K$ according to Eq. (36)
into the two kernels $K^{(1)}$ and $K^{(2)}$ is not gauge invariant and the
quantities,

$$\int d^3r'\,K^{(j)}_{\nu\mu}(\mathbf{r}\mathbf{r}')A_\mu(\mathbf{r}') \qquad (j = 1, 2),$$

are in general not solenoidal and therefore cannot be interpreted as partial
currents.

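Imposing physically reasonable restrictions (the external magnetic field
approaches zero at large distances like $r^{-3}$; nonzero temperature), we have
given an optimal and gauge invariant partition of the response kernel,

$$K = K^D + K^P, \tag{58}$$

where $K^D$ is defined by Eq. (54). The gauge invariance is reflected by the
relations

$$\begin{aligned}
\partial K^D_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r_\nu &= \partial K^D_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r'_\mu = 0, \\
\partial K^P_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r_\nu &= \partial K^P_{\nu\mu}(\mathbf{r}\mathbf{r}')/\partial r'_\mu = 0,
\end{aligned} \tag{59}$$

implying that the linear response $\mathbf{J}^L$ consists of two solenoidal and
gauge invariant currents $\mathbf{J}^D$ and $\mathbf{J}^P$,

$$\begin{aligned}
\mathbf{J}^L(\mathbf{r}) &= \mathbf{J}^D(\mathbf{r}) + \mathbf{J}^P(\mathbf{r}), \\
\operatorname{div}\mathbf{J}^D &= \operatorname{div}\mathbf{J}^P = 0,
\end{aligned} \tag{60}$$

which can be calculated from the relations

$$\begin{aligned}
cJ^D_\nu(\mathbf{r}) &= \int d^3r'\,K^D_{\nu\mu}(\mathbf{r}\mathbf{r}')A_\mu(\mathbf{r}'), \\
cJ^P_\nu(\mathbf{r}) &= \int d^3r'\,K^P_{\nu\mu}(\mathbf{r}\mathbf{r}')A_\mu(\mathbf{r}'),
\end{aligned} \tag{61}$$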
in any gauge of the vector potential. This partition is optimal in the sense that
both $K^D$ and $K^P$ are maximal negative, i.e.,*

$$K^D \geq K^{(1)} < 0, \qquad 0 < K^P \leq K^{(2)}. \tag{62}$$

This relation holds for $T > 0$; for $T = 0$ there may be exceptional situations
(e.g., for atoms at $T = 0$ the contribution of $K^P$ may vanish). The kernel
$K^D$ [Eq. (54)] may be calculated from a knowledge of the charge density
$-e_0\rho_0(\mathbf{r})$ of the unperturbed system only. For a given external
field the direct solution of Eq. (44) may be more practical; the current
$\mathbf{J}^D$ is then given by

$$\mathbf{J}^D(\mathbf{r}) = -(e_0^2/mc)\,\rho_0(\mathbf{r})\{\mathbf{A}^T(\mathbf{r}) + \operatorname{grad}\chi_{\mathrm{opt}}(\mathbf{r})\}. \tag{63}$$

The kernel $K^P$ is given by $K - K^D$ and is the true current-current
autocorrelation tensor.

Because of these unique properties we propose to call $\mathbf{J}^D$ and
$\mathbf{J}^P$ diamagnetic and paramagnetic current density, respectively;
$K^D$ the diamagnetic part and $K^P$ the paramagnetic part of the response
kernel $K$. All linear magnetic properties of a molecule are determined by the
response kernel; therefore the concepts of a diamagnetic and a paramagnetic
part of, say, the susceptibility tensor $\chi_{\nu\mu}(\mathbf{r}\mathbf{r}')$
or the chemical shift tensor $\sigma_{\nu\mu}(\mathbf{r}\mathbf{r}')$ are
uniquely and optimally defined, e.g.,

$$\chi^D \geq \chi^{(1)} < 0, \qquad 0 < \chi^P \leq \chi^{(2)}, \tag{64}$$

$$\sigma^D \geq \sigma^{(1)} < 0, \qquad 0 < \sigma^P \leq \sigma^{(2)}. \tag{65}$$


Appendix. A Temperature Dependent Scalar Product for Operators 

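It is convenient to introduce a metric into the algebra of the operators on
the Hilbert space of quantum mechanical state vectors. In this operator
algebra, a scalar product is a scalar valued function of two operators
$\mathfrak{X}$ and $\mathfrak{Y}$, written $\langle\mathfrak{X}\mid\mathfrak{Y}\rangle$,
such that

(a) $\langle\mathfrak{X}\mid\mathfrak{Y}\rangle = \langle\mathfrak{Y}\mid\mathfrak{X}\rangle^*$

(b) $\langle\mathfrak{X}\mid a_1\mathfrak{Y}_1 + a_2\mathfrak{Y}_2\rangle = a_1\langle\mathfrak{X}\mid\mathfrak{Y}_1\rangle + a_2\langle\mathfrak{X}\mid\mathfrak{Y}_2\rangle$ ($a_i$ = scalar)

(c) $\langle\mathfrak{X}\mid\mathfrak{X}\rangle > 0$ for $\mathfrak{X} \neq \mathfrak{O}$ ($\mathfrak{O}$ = zero operator)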

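* $K > 0$ means a positive definite kernel, $\int d^3r\int d^3r'\,V_\nu(\mathbf{r})K_{\nu\mu}(\mathbf{r}\mathbf{r}')V_\mu(\mathbf{r}') > 0$ for all admissible
vectors $\mathbf{V}$. $K > L$ means that $K - L$ is positive definite.

For a response theory in respect to an unperturbed canonical ensemble with
the Hamiltonian $\mathfrak{H}$ the following definition of a proper scalar
product is appropriate:†

$$\langle\mathfrak{X}\mid\mathfrak{Y}\rangle = \langle\kappa\{\mathfrak{X}^+\}\mathfrak{Y}\rangle = \int_0^\beta d\lambda\,\mathrm{Tr}\{e^{-(\beta-\lambda)\mathfrak{H}}\,\mathfrak{X}^+e^{-\lambda\mathfrak{H}}\,\mathfrak{Y}\}/\mathrm{Tr}\{e^{-\beta\mathfrak{H}}\},$$

where $\kappa$ is the superoperator of the Kubo transform [Eq. (30)]. The
verification of the conditions (a) and (b) is trivial, just as that of the
additional relation

$$\langle\mathfrak{X}\mid\mathfrak{Y}\rangle = \langle\mathfrak{Y}^+\mid\mathfrak{X}^+\rangle.$$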

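This scalar product is tailored to a problem with the Hamiltonian
$\mathfrak{H}$. The derivation superoperator $\mathfrak{h}$ of the Hamiltonian
defined by

$$\mathfrak{h}(\mathfrak{X}) = [\mathfrak{X}, \mathfrak{H}] \qquad \text{(for all operators } \mathfrak{X})$$

is hermitean in respect to this metric, i.e.,

$$\langle\mathfrak{X}\mid\mathfrak{h}(\mathfrak{Y})\rangle = \langle\mathfrak{h}(\mathfrak{X})\mid\mathfrak{Y}\rangle.$$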

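The positive definite character of this scalar product depends essentially on
the Hamiltonian used; the property (c) holds if and only if the Hamiltonian
has a complete set of eigenfunctions. Let $\{\psi_n\}$ be a complete,
orthonormalized system of eigenfunctions of the Hamiltonian,

$$\mathfrak{H}\psi_n = E_n\psi_n, \qquad (\psi_n, \psi_m) = \delta_{nm}, \qquad X_{nm} = (\psi_n, \mathfrak{X}\psi_m).$$

In this representation, the norm of $\mathfrak{X}$ is given by

$$\begin{aligned}
\langle\mathfrak{X}\mid\mathfrak{X}\rangle
 &= (1/Z)\sum_n\sum_m\int_0^\beta d\lambda\,\exp(-\beta E_n + \lambda E_n - \lambda E_m)\,X^*_{mn}X_{mn} \\
 &= (1/Z)\sum_{\substack{n,\,m \\ (E_n \neq E_m)}} |X_{mn}|^2\,\frac{\exp(-\beta E_m) - \exp(-\beta E_n)}{E_n - E_m}
  + (\beta/Z)\sum_{\substack{n,\,m \\ (E_n = E_m)}} |X_{mn}|^2\exp(-\beta E_n).
\end{aligned}$$

If $\mathfrak{X}$ is not the zero operator, there exists at least one $X_{nm}$
different from zero, whence $\langle\mathfrak{X}\mid\mathfrak{X}\rangle > 0$ for
$0 < \beta < \infty$. For $\beta = 0$ or $T = 0$, the scalar product becomes
semidefinite.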

† This scalar product was used by Nakano (1960, 1963) and by Mori (1965), but its
important property of positive definiteness was neither used nor proved.
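
The positivity argument of the appendix can be checked numerically on a small model. The following is a minimal sketch, not part of the original derivation: the three-level spectrum, the matrix `x`, and the value of `beta` are hypothetical illustration values, and the function names are invented here. It evaluates the norm once from the closed-form weights and once by direct midpoint quadrature of the λ integral.

```python
import math


def kubo_weight(e_n, e_m, beta):
    """Closed form of the lambda integral for one pair of levels (always > 0)."""
    if e_n == e_m:  # degenerate pair: the integrand does not depend on lambda
        return beta * math.exp(-beta * e_n)
    return (math.exp(-beta * e_m) - math.exp(-beta * e_n)) / (e_n - e_m)


def kubo_norm(energies, x, beta):
    """<X|X> in the energy eigenbasis, using the closed-form weights."""
    z = sum(math.exp(-beta * e) for e in energies)
    total = 0.0
    for n, e_n in enumerate(energies):
        for m, e_m in enumerate(energies):
            total += abs(x[m][n]) ** 2 * kubo_weight(e_n, e_m, beta)
    return total / z


def kubo_norm_quadrature(energies, x, beta, steps=2000):
    """Same norm, with the lambda integral done by the midpoint rule."""
    z = sum(math.exp(-beta * e) for e in energies)
    h = beta / steps
    total = 0.0
    for k in range(steps):
        lam = (k + 0.5) * h
        for n, e_n in enumerate(energies):
            for m, e_m in enumerate(energies):
                total += math.exp(-beta * e_n + lam * (e_n - e_m)) * abs(x[m][n]) ** 2
    return total * h / z


# Hypothetical three-level system with one degenerate pair (illustration only).
energies = [0.0, 1.0, 1.0]
x = [[0.0, 1 + 2j, 0.0],
     [0.5, 0.0, 0.3j],
     [0.0, 0.0, 0.0]]
beta = 2.0

closed = kubo_norm(energies, x, beta)
numeric = kubo_norm_quadrature(energies, x, beta)
print(closed, numeric)
```

Every weight returned by `kubo_weight` is positive for finite positive `beta`, so a single nonzero matrix element is enough to make the norm positive, which is the content of property (c).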
I. Introduction 

Pi-electron theory has proven to be a useful theory in explaining molecular 
electronic and magnetic spectra, ionization potentials, and relative reactivity 
where this can be interpreted in terms of electron-density distribution. The 
theory developed rather intuitively via Hückel theory and was given a more
quantitative foundation by Goeppert-Mayer and Sklar. Essential simplifications
introduced by Pariser, Parr, and Pople led to a very tractable and
successful parameterized theory. Finally, the pi-electron approximation was 
given a general theoretical formulation by Lykos and Parr (Lykos, 1964). 
Now that this adequate theoretical framework has been developed, the next 
step is to test it by first making ab initio calculations on selected systems, and 
then analyzing these within the theoretical framework of pi-electron theory. 

There exists a substantial gap between accurate ab initio calculations for
small molecules such as Cl₂⁻ (Wahl and Gilbert, 1966) and semiempirical
calculations on larger molecules such as purine. While the formalism of the
ab initio approaches has now come to be more and more widely used as a basis
for improved semiempirical theories, ultimately a representation for the
molecular orbitals involved needs to be specified and the corresponding one-
and two-electron integrals somehow evaluated if numbers relating to physical
observables are to be produced. There have been two separate levels of
approximation used here.

* A preliminary account of this work was given by P. G. Lykos, R. B. Hermann, J. D.
Sharp-Ritter, and R. Moccia, Bull. Am. Phys. Soc. 9, 145 (1964).

† Present address: Analytical Research Department, Eli Lilly and Company, Indianapolis,
Indiana.

‡ Present address: Roosevelt University, Chicago, Illinois.

§ Present address: University of Pisa, Pisa, Italy.

¶ A continuation of this work and an extension to heterocyclics is given in: Molecular
Fragments of Heterocyclic Aromatic Compounds, J. D. Sharp-Ritter, Ph.D. Thesis,
Illinois Inst. Tech. (1965).

The first level has been concerned with the identity and the size of the basis 
set to be used for representing the molecular orbitals for the system. It has 
been common for various workers to use a single Slater orbital, or Slater-type 
orbital at each nucleus per electron or per electron pair brought to the molecule 
by the corresponding atom. More recently Hartree-Fock orbitals found for 
the corresponding free atom have been used in constructing molecular orbitals 
for the molecular system. In the case where single Slater-type orbitals have 
been used there have been some attempts to “condition” the atomic orbital 
to its molecular environment by adjusting the scale of the orbital empirically. 
However, all of these procedures prove to be inadequate when tested within 
the framework of a simple system where higher accuracy can be achieved. 
The most promising new development for ab initio work appears to be that where
Hartree-Fock orbitals for the corresponding free atoms are used as building 
blocks and appropriate additional Slater-type orbitals introduced to allow for 
changes in the outermost atomic orbitals which are most affected by the process 
of bond formation (Das and Wahl, 1966). In semiempirical work the problem 
of selection of a basis has been circumvented to a large extent by considering 
that the molecule is made up of atomlike parts where such atomlike parts can 
be identified with free atoms or smaller molecules existing in appropriate 
valence states. Then experimentally determined quantities are used to “evaluate”
those molecular integrals relating to the atomlike parts (Cusachs and
Reynolds, 1965). However, no satisfactory technique has been evolved which 
assesses the interaction between such atomlike parts without necessitating 
actual evaluation of integrals over explicit atomlike wave functions. 

The second level of approximation has been in the evaluation of the one- 
and two-electron integrals that arise once the basis for the representation has 
been fixed. Here it has been customary to use the Mulliken or Sklar or even 
rougher approximations in order to reduce all integrals to combinations of 
one-center integrals and two-center Coulomb-type and overlap integrals 
(Fischer-Hjalmars, 1965). Furthermore, in dealing with conjugated systems, 
only some of the electrons are handled explicitly and the effect of the remaining 
electrons is assessed in some average way, usually empirically. It has been at 
this level where the most striking breakthroughs have been made in the recent 
past in semiempirical pi-electron theory. In fact, in the more successful pi- 
electron theories such as the Pariser-Parr and Pople theories, no atomic orbital 
is ever explicitly specified (Parr, 1963). More recently, however, it has become 
clear that, even in dealing with semiempirical approaches, the overlap integrals
between wave functions representing different atom- or molecular fragment-like
parts need to be calculated, as these quantities cannot be inferred from
experiment (Adams and Miller, 1966).

The present state of development of electronic digital computers and the 
corresponding coordinated molecular wave function computer programs is 
such that fairly accurate Hartree-Fock and Extended Hartree-Fock wave 
functions can be determined for small molecules such as hydrides and diatomic 
molecules made up from first and second row elements. Accordingly, even 
though ab initio accurate calculations on large molecules may not be feasible 
at this time, it is possible to define models for portions of larger molecules for 
which fairly accurate Hartree-Fock or Extended Hartree-Fock solutions can 
be found. The present work constitutes a case in point. 

In this paper we report and analyze the results of large-scale computations 
on the simplest molecules involving pi electrons, namely, the planar molecules 
CH₃⁺, CH₃, and CH₃⁻. The methyl positive ion is included as part of the
homologous series even though it has no pi electrons while the radical and the 
negative ion have one and two pi electrons, respectively. One advantage of 
treating these simplest systems is that the analysis is not complicated by any 
of the ancillary approximations usually appended to the basic pi-electron 
approximation. In addition the sensitivity of the “core” to change in pi-electron 
density may be assessed in going from no pi electrons, to one pi electron, to 
two pi electrons. 


II. Description of Calculation 

The theoretical framework within which the work was done was the Hartree-
Fock-Roothaan method (Roothaan, 1960). A one-center representation centered
on the carbon nucleus was used (Moccia, 1964). Five basis functions were
used for the pi orbital and twenty-two basis functions were used to represent
the core electrons. These basis functions were real Slater-type orbitals
(STO’s) and the scale factors, treated as variational parameters, were selected
optimally to within ±0.1.

More specifically, an f₀ function was added variationally to the four p₀ orbital
basis for the pi orbital in order that oblateness of the charge distribution would
be permitted should this prove to be energetically favorable within this
procedural framework. For the core, several basis functions of s, p, d, and f-type
symmetry were employed in order to represent the inner-shell charge 
density about the carbon nucleus, the charge density between the several 
nuclei, and the charge density about the protons. There was no attempt to use 
very large powers of r in the STO’s to improve the charge distribution about 
the protons (Joy and Handler, 1965). 
The size of the basis for the peel as well as the core was adjusted in order to 
determine the sensitivity of the size and shape of the pi orbital to the basis 
for the representation. As was expected, refinement of the core description, 
especially in the vicinity of the protons, did not alter the pi orbital significantly. 

A one-center and a two-center STO representation for a Hartree-Fock- 
Roothaan treatment of NH were compared recently by Lounsbury (1965) and 
he found that the nitrogen inner-shell orbitals and the NH pi orbitals obtained 
with the two representations compared very well. 

The best wave functions obtained for the three fragments CH₃⁺, CH₃, and
CH₃⁻ are given in Tables 1, 2, and 3. In addition, the best wave function for
CH₃ where only p₀ orbitals were used to represent the pi orbital is given in
Table 4.


TABLE 1

CH₃⁺ Positive Ionᵃ

 N   L   M   Orb. exp.               Eigenvectors

 1   0   0     5.3820     0.923458   -0.247883    0.          0.
 1   0   0     9.1520     0.083020    0.015562    0.          0.
 3   0   0     3.0370     0.012900   -0.080459    0.          0.
 2   0   0     1.6840    -0.010936    1.010937    0.          0.
 2   1   1     0.9740    -0.         -0.          0.018656    0.
 2   1   1     1.6840    -0.         -0.          0.497836    0.
 2   1   1     3.5540    -0.         -0.          0.100648    0.
 2   1  -1     0.9740    -0.         -0.          0.          0.018656
 2   1  -1     1.6840    -0.         -0.          0.          0.497835
 2   1  -1     3.5540    -0.         -0.          0.          0.100648
 3   2   0     1.8000    -0.000289   -0.239525    0.          0.
 3   2   0     2.3000    -0.000749    0.104640    0.          0.
 3   2   2     1.8000    -0.         -0.          0.         -0.384982
 3   2   2     2.3000    -0.         -0.          0.          0.184554
 3   2  -2     1.8000    -0.         -0.         -0.384982    0.
 3   2  -2     2.3000    -0.         -0.          0.184554    0.
 7   3   1     3.3000    -0.         -0.         -0.083356    0.
 7   3  -1     3.3000    -0.         -0.          0.         -0.083356
 7   3  -3     3.3000    -0.000409   -0.120242    0.          0.
 4   0   0     2.0000     0.004110    0.089028    0.          0.
 4   1   1     2.0000    -0.         -0.          0.435169    0.
 4   1  -1     2.0000    -0.         -0.          0.          0.435170

Eigenvaluesᵇ            -11.631391   -1.262828   -0.917259   -0.917259

Electronic energyᵇ        = -48.82312632
Nuclear repulsion energy  =   9.80425858
Total energy              = -39.01886749

ᵃ The nuclear configuration is planar with all protons 2.0126 bohrs from the carbon
atom and defining an equilateral triangle.

ᵇ All energies are expressed in hartrees.
TABLE 2

CH₃ Radicalᵃ

 N   L   M   Orb. exp.                     Eigenvectors

 2   1   0     0.9550     0.          0.          0.          0.          0.434626
 2   1   0     1.4209     0.          0.          0.          0.          0.438395
 2   1   0     2.5880     0.          0.          0.          0.          0.193138
 2   1   0     6.3400     0.          0.          0.          0.          0.011716
 7   3   0     2.9000     0.          0.          0.          0.         -0.028428
 1   0   0     5.3820     0.923116   -0.224453    0.          0.          0.
 1   0   0     9.1520     0.083105    0.009811    0.          0.          0.
 3   0   0     3.0370     0.012616   -0.018086    0.          0.          0.
 2   0   0     1.6840    -0.009322    0.848610    0.          0.          0.
 2   1   1     0.9740     0.          0.          0.121023    0.          0.
 2   1   1     1.6840     0.          0.          0.419144    0.          0.
 2   1   1     3.5540     0.          0.          0.087917    0.          0.
 2   1  -1     0.9740     0.          0.          0.          0.121024    0.
 2   1  -1     1.6840     0.          0.          0.          0.419143    0.
 2   1  -1     3.5540     0.          0.          0.          0.087917    0.
 3   2   0     1.8000     0.000027   -0.281292    0.          0.          0.
 3   2   0     2.2000    -0.000627    0.146625    0.          0.          0.
 3   2   2     1.8000     0.          0.          0.         -0.484756    0.
 3   2   2     2.2000     0.          0.          0.          0.271046    0.
 3   2  -2     1.8000     0.          0.         -0.484755    0.          0.
 3   2  -2     2.2000     0.          0.          0.271046    0.          0.
 7   3   1     3.2000     0.          0.         -0.092244    0.          0.
 7   3  -1     3.2000     0.          0.          0.         -0.092244    0.
 7   3  -3     3.2000    -0.000383   -0.126107    0.          0.          0.
 4   0   0     1.9000     0.003330    0.205516    0.          0.          0.
 4   1   1     1.9000     0.          0.          0.426057    0.          0.
 4   1  -1     1.9000     0.          0.          0.          0.426057    0.

Eigenvalues             -11.188883   -0.887634   -0.536939   -0.536939   -0.362691

Electronic energy         = -49.14288664
Nuclear repulsion energy  =   9.80425858
Total energy              = -39.33862782

ᵃ The nuclear configuration is planar with all protons 2.0126 bohrs from the carbon atom and
defining an equilateral triangle.
TABLE 3

CH₃⁻ Negative Ionᵃ

 N   L   M   Orb. exp.                     Eigenvectors

 2   1   0     0.6370     0.         -0.          0.          0.          0.590621
 2   1   0     1.3380     0.         -0.          0.          0.          0.371632
 2   1   0     2.5820     0.         -0.          0.          0.          0.173049
 7   3   0     2.9000     0.         -0.          0.          0.         -0.041928
 2   1   0     6.3000     0.         -0.          0.          0.          0.007288
 1   0   0     5.3820     0.923091   -0.200160    0.          0.         -0.
 1   0   0     9.1520     0.093176    0.000449    0.          0.         -0.
 3   0   0     3.0370     0.012454    0.120689    0.          0.         -0.
 2   0   0     1.6840    -0.009324    0.607702    0.          0.         -0.
 2   1   1     0.9740     0.         -0.          0.268957    0.         -0.
 2   1   1     1.6840     0.         -0.          0.345042    0.         -0.
 2   1   1     3.5540     0.         -0.          0.086262    0.         -0.
 2   1  -1     0.9740     0.         -0.          0.          0.268956   -0.
 2   1  -1     1.6840     0.         -0.          0.          0.345043   -0.
 2   1  -1     3.5540     0.         -0.          0.          0.086262   -0.
 3   2   0     1.8000     0.000445   -0.280525    0.          0.         -0.
 3   2   0     2.2000    -0.000900    0.146692    0.          0.         -0.
 3   2   2     1.8000     0.         -0.          0.         -0.499444   -0.
 3   2   2     2.2000     0.         -0.          0.          0.283946   -0.
 3   2  -2     1.8000     0.         -0.         -0.499444    0.         -0.
 3   2  -2     2.2000     0.         -0.          0.283947    0.         -0.
 7   3   1     3.2000     0.         -0.         -0.096824    0.         -0.
 7   3  -1     3.2000     0.         -0.          0.         -0.096824   -0.
 7   3  -3     3.2000    -0.000373   -0.127223    0.          0.         -0.
 4   0   0     1.9000     0.003492    0.329126    0.          0.         -0.
 4   1   1     1.9000     0.         -0.          0.351158    0.         -0.
 4   1  -1     1.9000     0.         -0.          0.          0.351158   -0.

Eigenvaluesᵇ            -10.909549   -0.613559   -0.262555   -0.262555    0.007237

Electronic energy         = -49.07761574
Nuclear repulsion energy  =   9.80425858
Total energy              = -39.27335691

ᵃ The nuclear configuration is planar with all protons 2.0126 bohrs from the carbon atom and
defining an equilateral triangle.

ᵇ All energies are expressed in hartrees.
TABLE 4

CH₃ Radicalᵃ

 N   L   M   Orb. exp.                     Eigenvectors

 2   1   0     0.9550    -0.          0.          0.          0.          0.434613
 2   1   0     1.4209    -0.          0.          0.          0.          0.438097
 2   1   0     2.5880    -0.          0.          0.          0.          0.194056
 2   1   0     6.3400    -0.          0.          0.          0.          0.011668
 1   0   0     5.3820     0.923115   -0.224419    0.          0.          0.
 1   0   0     9.1520     0.083108    0.009848    0.          0.          0.
 3   0   0     3.0370     0.012602   -0.018992    0.          0.          0.
 2   0   0     1.6840    -0.009307    0.849425    0.         -0.000002    0.
 2   1   1     0.9740    -0.          0.          0.118610    0.          0.
 2   1   1     1.6840    -0.          0.          0.419318    0.          0.
 2   1   1     3.5540    -0.          0.          0.087866    0.          0.
 2   1  -1     0.9740    -0.          0.          0.          0.118611    0.
 2   1  -1     1.6840    -0.          0.000002    0.          0.419317    0.
 2   1  -1     3.5540    -0.          0.          0.          0.087866    0.
 3   2   0     1.8000     0.000152   -0.281848    0.          0.          0.
 3   2   0     2.2000    -0.000737    0.145971    0.          0.          0.
 3   2   2     1.8000    -0.          0.          0.         -0.486012    0.
 3   2   2     2.2000    -0.          0.          0.          0.271774    0.
 3   2  -2     1.8000    -0.          0.         -0.486012    0.          0.
 3   2  -2     2.2000    -0.          0.          0.271774    0.          0.
 7   3   1     3.2000    -0.          0.         -0.093917    0.          0.
 7   3  -1     3.2000    -0.          0.          0.         -0.093917    0.
 7   3  -3     3.2000    -0.000383   -0.126468    0.          0.          0.
 4   0   0     1.9000     0.003324    0.205282    0.         -0.000001    0.
 4   1   1     1.9000    -0.          0.          0.428020    0.          0.
 4   1  -1     1.9000    -0.          0.          0.          0.428019    0.

Eigenvaluesᵇ            -11.189321   -0.887997   -0.538903   -0.538903   -0.361982

Electronic energyᵇ        = -49.13579464
Nuclear repulsion energy  =   9.80425858
Total energy              = -39.33153582

ᵃ The nuclear configuration is planar with all protons 2.0126 bohrs from the carbon atom and
defining an equilateral triangle.

ᵇ All energies are expressed in hartrees.
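
As a quick consistency check on Tables 1 to 4, the total energy of each fragment should equal the electronic energy plus the nuclear repulsion energy. The short sketch below uses only the tabulated values; the dictionary layout and names are illustrative choices made here, and the nuclear repulsion energy is taken as 9.80425858 hartrees for all four tables since the geometry is the same.

```python
# Electronic and total energies (hartrees) transcribed from Tables 1-4.
NUCLEAR_REPULSION = 9.80425858  # same planar geometry in all four tables

tables = {
    "Table 1 (CH3+)": {"electronic": -48.82312632, "total": -39.01886749},
    "Table 2 (CH3)": {"electronic": -49.14288664, "total": -39.33862782},
    "Table 3 (CH3-)": {"electronic": -49.07761574, "total": -39.27335691},
    "Table 4 (CH3, p-only pi basis)": {"electronic": -49.13579464, "total": -39.33153582},
}

# Residual = (electronic + nuclear repulsion) - tabulated total, per table.
residuals = {
    name: row["electronic"] + NUCLEAR_REPULSION - row["total"]
    for name, row in tables.items()
}

for name, residual in residuals.items():
    print(f"{name}: residual = {residual:.2e} hartree")
```

The residuals are far below the precision relevant to the discussion; a small common offset would be consistent with rounding of the printed nuclear repulsion energy.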
III. Discussion 

According to Koopmans’ approximation (Koopmans, 1933), the ionization
potential is just the orbital energy ε of the pi electrons. In pi-electron theory it
is customary to maintain a fixed core in order to calculate two different states
or the ionization potential of a system. The fixed core ionization potential of
CH₃ is just equal to the orbital energy of the pi electron. It can be seen that
for the system CH₃, the fixed core approximation is in error by 10% of the
true HF ionization potential, so that the core readjustment does affect the IP,
as can be seen by examination of the vectors in Tables 1 and 2. The correlation
energy of the pi electron with the core is estimated at 0.934 eV (Hermann, 1965),
so that the fixed-core approximation compensates for most of this neglect.
The vertical electron affinity for CH₃ was found here in the HF approximation
to be slightly positive at +1.77 eV. The increment in the total correlation
energy of adding a pi electron to planar CH₃ is estimated to be -1.71 eV
(Hermann, 1965). It is interesting that here, too, the constant core
approximation again partly compensates for the neglect of correlation effects on
the electron affinity.
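
The quantities discussed above can be recomputed from the tabulated total energies. The sketch below is illustrative only: it assumes the conversion factor 1 hartree = 27.2114 eV (not stated in the text), takes the pi orbital energy of the radical from Table 2, and treats the "electron affinity" as the energy difference E(CH₃⁻) - E(CH₃), which is the sign convention that reproduces the quoted +1.77 eV.

```python
HARTREE_TO_EV = 27.2114  # assumed conversion factor

# Total Hartree-Fock energies in hartrees (Tables 1, 2, and 3).
total_energy = {
    "CH3+": -39.01886749,
    "CH3": -39.33862782,
    "CH3-": -39.27335691,
}
pi_orbital_energy_ch3 = -0.362691  # pi eigenvalue of the radical, Table 2

# Ionization potential from the difference of two separately optimized states.
ip_delta_scf = (total_energy["CH3+"] - total_energy["CH3"]) * HARTREE_TO_EV

# Fixed-core (Koopmans) estimate: minus the pi orbital energy.
ip_fixed_core = -pi_orbital_energy_ch3 * HARTREE_TO_EV

# Energy change on adding a pi electron to planar CH3.
ea_vertical = (total_energy["CH3-"] - total_energy["CH3"]) * HARTREE_TO_EV

relative_error = (ip_fixed_core - ip_delta_scf) / ip_delta_scf

print(f"IP (difference of HF energies): {ip_delta_scf:.2f} eV")
print(f"IP (fixed core):                {ip_fixed_core:.2f} eV")
print(f"relative deviation:             {relative_error:.1%}")
print(f"E(CH3-) - E(CH3):               {ea_vertical:+.2f} eV")
```

With these inputs the fixed-core value exceeds the energy-difference value by an amount of the same order as the 10% quoted in the text.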

The nature of the basis set used in the representation of one-electron MO’s
is usually taken to be one STO on each center with the Slater-Zener ζ-value
of 1.59 occasionally adjusted to give correct atomic valence state results.
In a pi system the core potential in which the p_z electron moves should be
better represented by assuming the core to be locally more similar to the core
of CH₃ than a point charge of 3.18.

Fig. 1. The comparison of radial plots of the pi orbitals of the carbon atom in various
systems. The STO with ζ = 1.405 is the best compromise STO found empirically by Adams
and Miller (1966).

It is interesting to compare the p_z orbital of the planar methyl radical with
that of the Hartree-Fock valence state carbon atom (Fig. 1). If it is assumed
that a basis set for pi-electron systems may be constructed from the localized
p_z orbital of the planar methyl radical rather than atomic SCF orbitals or
atomic Slater-type orbitals, a revision in a number of the one- and two-center
integrals used in Pariser-Parr theory is implied (Table 5).
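
To make the comparison of radial extent concrete, the pi orbital of the CH₃ radical can be rebuilt from the four 2p exponents and coefficients of Table 4 and compared with a single STO of ζ = 1.59. This is a minimal sketch under one assumption that is not stated in the text: the tabulated eigenvector coefficients refer to individually normalized STO's with radial part r^(n-1) exp(-ζr). The function names are invented here for illustration.

```python
import math


def sto_norm(n, zeta):
    """Normalization constant of the radial STO r**(n-1) * exp(-zeta*r)."""
    return (2.0 * zeta) ** (n + 0.5) / math.sqrt(math.factorial(2 * n))


def radial_moment(n, zeta_i, zeta_j, k):
    """<R_i | r**k | R_j> for two normalized STO's with the same n."""
    prefactor = sto_norm(n, zeta_i) * sto_norm(n, zeta_j)
    return prefactor * math.factorial(2 * n + k) / (zeta_i + zeta_j) ** (2 * n + k + 1)


def expectation(coeffs, zetas, n, k):
    """Sum over all pairs of basis functions of c_i c_j <R_i | r**k | R_j>."""
    return sum(
        ci * cj * radial_moment(n, zi, zj, k)
        for ci, zi in zip(coeffs, zetas)
        for cj, zj in zip(coeffs, zetas)
    )


# Pi orbital of the CH3 radical, transcribed from Table 4 (N = 2, L = 1, M = 0).
zetas = [0.9550, 1.4209, 2.5880, 6.3400]
coeffs = [0.434613, 0.438097, 0.194056, 0.011668]

norm_ch3 = expectation(coeffs, zetas, 2, 0)               # should be close to 1
mean_r_ch3 = expectation(coeffs, zetas, 2, 1) / norm_ch3  # <r> in bohrs
mean_r_sto = expectation([1.0], [1.59], 2, 1)             # single STO, zeta = 1.59

print(f"norm of Table 4 pi orbital: {norm_ch3:.4f}")
print(f"<r> CH3 pi orbital:         {mean_r_ch3:.3f} bohr")
print(f"<r> STO (zeta = 1.59):      {mean_r_sto:.3f} bohr")
```

Under the stated normalization assumption the expansion comes out normalized to within rounding, and its mean radius is larger than that of the single STO, in line with the smaller one-center repulsion integral and larger overlap integrals listed for the CH₃ radical in Table 5.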


TABLE 5

One and Two Center Coulomb-Type Electron Repulsion Integrals (eV)

R (angstroms)          0        1.4      2.425    2.8

STO (ζ = 1.59)         16.93    9.03     5.67     4.97
Carbon atom SCF        15.55    8.67     5.57     4.89
CH₃ radicalᵃ           14.19    8.42     5.50     4.85

Overlap Integrals

R (angstroms)          1.4      2.425    2.8

STO (ζ = 1.59)         0.26     0.0389   0.0177
Carbon atom SCF        0.3269   0.0839   0.0493
CH₃ radicalᵃ           0.3797   0.1124   0.0693

ᵃ These were obtained using the pi orbital defined in Table 4.

For the prediction of ground-state properties of aromatic molecules, such
as charge densities and ground-state ionization potentials, CH₃ pi orbitals
should make a good basis set. It is conceivable that some sort of interpolation
between the systems given here with integer numbers of pi electrons might be
effected in order to accommodate carbon atomlike parts of molecules with
noninteger numbers of electrons. The results should be compared with HF
values rather than exact values because of the neglect of correlation in the
one-center repulsion integral and also of right-left correlation. Comparison
with exact results may be easier if the (11/11) integral is reduced by the
correlation energy present in adding one pi electron to the entire atom, or 1.7 eV,
and right-left correlation with configuration interaction is included.

IV. Conclusion 

Accurate ab initio orbital treatment of large asymmetrical molecules is not 
possible at the present time. Accordingly, semiempirical approaches need to 
be employed. However, accurate ab initio treatment of small molecules is 
possible at the present time. Accordingly, it is possible to transcend the usual 
approaches to molecular systems which involve using wave functions for free 
atoms as a basis for representing molecular orbitals. The work reported here 
is an illustration of what can be done in this regard and reveals that there can 
be considerable difference in the atom-like fragment wave functions to be 
used in a molecule according as these are found for a free atom or a molecular 
fragment. We wish to acknowledge financial assistance from the United States 
Public Health Service. 
I. Introduction 

Charge transfer complexes are molecular or supramolecular entities formed 
from two (sometimes more) ordinarily stable molecular components through 
a more or less complete transfer of an electron from one of the components 
(the electron donor) to the other (the electron acceptor). Following the 
quantum theory of the phenomenon, which is the only satisfactory one 
(Mulliken, 1952a and b, 1956, 1964, general reviews: McGlynn, 1958, 
1960; Briegleb, 1961; Andrews and Keefer, 1964), the interaction of an 
electron-donor (D) with an electron-acceptor (A) may be described by saying 
that when D and A combine to form a complex, the wave function for their 
association may be written approximately: 

$$\Psi_N = a\Psi(\mathrm{DA}) + b\Psi(\mathrm{D^+A^-}), \qquad a > b,$$

for the ground state, and

$$\Psi_E = b^*\Psi(\mathrm{DA}) - a^*\Psi(\mathrm{D^+A^-}), \qquad a^* > b^*,$$

for the excited state.

In these expressions $\Psi(\mathrm{DA})$ denotes the so-called no-bond wave
function, that is, the wave function corresponding to a structure in which the
binding of the two components is effected by the “classical” intermolecular
forces (the electrostatic, dispersion, H-bonding, etc., forces), while
$\Psi(\mathrm{D^+A^-})$ denotes the so-called dative-bond wave function,
corresponding to a structure of the complex in which one electron has been
transferred from D to A and in which besides the forces listed above there is
also a weak chemical binding between the odd electrons situated on the two
components of the complex. It can be seen that the charge transfer is generally
more pronounced in the excited state of the complex than in its ground state.
The transition from the ground to the excited state is frequently associated
with the appearance of a new absorption band, situated generally toward long
wavelengths, which is the essential and practically the only unambiguous
indication of the formation of a charge transfer complex, although such a
formation may also be associated with the appearance of other new
characteristics: dipole moments, enhancements of semiconductivity or of chemical
reactivity, etc. In so far as charge transfer complexes may be intermediates in
the generation of free radicals, useful information about their formation may
sometimes be obtained by electron-spin-resonance spectroscopy (Isenberg, 1964).

* This work was supported by Public Health Service Research Grant No. GM 12289-01
from the National Institute of General Medical Sciences.

The energy quantities involved in the formation of the ground state and the
transition to the excited state may be obtained by solving the appropriate
secular determinant. On writing the expressions for $\Psi_N$ and $\Psi_E$ in
the simplified forms

$$\Psi_N = a\Psi_0 + b\Psi_1,$$

$$\Psi_E = b^*\Psi_0 - a^*\Psi_1,$$

this looks as follows:

$$\begin{vmatrix} E_0 - E & H_{01} - ES_{01} \\ H_{01} - ES_{01} & E_1 - E \end{vmatrix} = 0,$$

where

$$E_0 = \int\Psi_0 H\Psi_0\,d\tau, \qquad E_1 = \int\Psi_1 H\Psi_1\,d\tau,$$

$$H_{01} = \int\Psi_0 H\Psi_1\,d\tau, \qquad S_{01} = \int\Psi_0\Psi_1\,d\tau = \frac{\sqrt{2}\,S_{\mathrm{DA}}}{(1 + S_{\mathrm{DA}}^2)^{1/2}}.$$

$E_0$ is the energy associated with $\Psi_0$, i.e., the sum of the separate
energies of D and A modified by any energy of attraction arising from forces
other than CTC, while $E_1$ includes the energy of attraction between the
charged species and the covalent binding between the odd electrons situated on
the two components. $H_{01}$ is the interaction energy of DA with D⁺A⁻, $H$
being the exact Hamiltonian of the entire set of nuclei and electrons of the
complex. $S_{\mathrm{DA}} = \int\phi_D\phi_A\,d\tau$ is the overlap integral
between the highest filled molecular orbital of the electron donor ($\phi_D$)
and the lowest empty orbital of the electron acceptor ($\phi_A$).

It is easily shown that the energies of the ground and of the excited states
can be approximated by

$$E_N = E_0 - \frac{(H_{01} - E_0S_{01})^2}{E_1 - E_0},$$

$$E_E = E_1 + \frac{(H_{01} - E_1S_{01})^2}{E_1 - E_0},$$

and the transition energy between them, corresponding to the charge transfer
band, is thus

$$\Delta E = E_E - E_N = E_1 - E_0 + \frac{(H_{01} - E_1S_{01})^2 + (H_{01} - E_0S_{01})^2}{E_1 - E_0},$$

which in the first approximation may be shown to be reducible to

$$\Delta E = I_D - E_A + \Delta,$$

where $I_D$ is the ionization potential of the donor, $E_A$ the electron affinity of
the acceptor, and $\Delta$ a stabilization term. The equation signifies that if we
consider a constant acceptor and vary the electron donors, and provided $\Delta$
is approximately constant, we may expect to observe a linear relation between 
the frequency of the charge transfer band and the ionization potential of the 
donor. Inversely, under the same conditions, if we have a constant donor 
and vary the acceptors, a linear relation should exist between the frequency 
and the electron affinities of the acceptors. In practice, it is the first of these 
correlations which is most frequently observed. 
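
The relation between the 2 x 2 secular determinant and the approximate energy expressions can be illustrated numerically. The sketch below solves the determinant exactly through the quadratic it expands to, (1 - S₀₁²)E² - (E₀ + E₁ - 2H₀₁S₀₁)E + (E₀E₁ - H₀₁²) = 0, and compares the roots with the second-order formulas. All numerical inputs are hypothetical values chosen for illustration, not data from the text, and the function names are invented here.

```python
import math


def secular_roots(e0, e1, h01, s01):
    """Exact roots of |E0-E, H01-E*S01; H01-E*S01, E1-E| = 0 (lower, upper)."""
    a = 1.0 - s01 ** 2
    b = -(e0 + e1 - 2.0 * h01 * s01)
    c = e0 * e1 - h01 ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    return (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)


def approximate_energies(e0, e1, h01, s01):
    """Second-order estimates of the ground- and excited-state energies."""
    gap = e1 - e0
    e_ground = e0 - (h01 - e0 * s01) ** 2 / gap
    e_excited = e1 + (h01 - e1 * s01) ** 2 / gap
    return e_ground, e_excited


# Hypothetical inputs in eV: energies measured from the no-bond structure.
e0, e1, h01, s01 = 0.0, 3.0, -0.3, 0.05

exact_ground, exact_excited = secular_roots(e0, e1, h01, s01)
approx_ground, approx_excited = approximate_energies(e0, e1, h01, s01)

print(f"ground state:  exact {exact_ground:.4f}, approximate {approx_ground:.4f}")
print(f"excited state: exact {exact_excited:.4f}, approximate {approx_excited:.4f}")
print(f"transition energy: exact {exact_excited - exact_ground:.4f}, "
      f"approximate {approx_excited - approx_ground:.4f}")
```

For a weak interaction (small H₀₁ and S₀₁ compared with E₁ - E₀) the two sets of values agree closely, which is the regime in which the approximate formulas are intended to hold.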


II. Biochemical Charge Transfer Complexes 

During the last few years, numerous authors have postulated the frequent 
formation of charge transfer complexes between molecules of biochemical 
interest and in particular conjugated biomolecules (for which relatively low 
ionization potentials and high electron affinities are expected) and have 
envisaged the involvement of such complexes both in the mechanism of 
biochemical reactions and in the structure of certain cellular components 
(nucleic acids, mitochondria, quantosomes). Szent-Gyorgyi (1960), in particular,
has been one of the protagonists of this conception.

Among the biomolecules which have most frequently been considered as 
possibly implicated in charge transfer complexes are the following (Pullman 
and Pullman, 1963): 
(1) the essential components of the oxidation-reduction coenzymes (if not
the coenzymes themselves), in particular the pyridinium (or nicotinamide)
ring of the pyridine nucleotides and the isoalloxazine ring of
the flavin coenzymes;

(2) the purines;

(3) indolic compounds, in particular tryptophan and serotonin;

(4) quinones. 

The associations which have been most extensively studied and considered 
as charge transfer complexes or at least as involving charge transfer as an 
important component in the overall binding concern the interactions: 

(1) between biomolecules containing the indole ring and pyridine nucleotides
or flavins,

(2) in or between the oxidation-reduction coenzymes, 

(3) between purines and a series of partners such as flavins, aromatic 
hydrocarbons, steroids, actinomycin, acridines, purines themselves, etc. 

The analysis of the principal available data involving such biomolecules 
and related model compounds indicates (Table 1) that the formation of a 

TABLE 1

Principal Researches on Charge Transfer Complexes in Biochemistry

| Complex | Reference | CTC band |
|---|---|---|
| Pyridinium salts with iodine | (Kosower, 1960, p. 171) | + |
| Indolic compounds and NAD+ | (Cilento and Giusti, 1959; Cilento and Tedeschi, 1961) | + |
| | (Alivisatos et al., 1961) | + |
| Indoles and flavins | (Isenberg and Szent-Gyorgyi, 1958, 1959; Isenberg et al., 1960) | +, a |
| Methyl indoles with quinones | (Foster and Hanson, 1964) | + |
| Intramolecular complex in indolyl ethylnicotinamide | (Shifrin, 1964a) | + |
| Intermolecular complexes in a model (I, in text) for interactions of aromatic amino acids with the nicotinamide moiety of NAD+ | (Shifrin, 1964b) | + |
| Tryptophan and pteridines | (Fujimori, 1958) | − |
| Indoles and related heterocycles with T2 bacteriophage | (Kanner and Kozloff, 1964) | − |
| Pyridinium cations and aromatic hydrocarbons | (Cilento and Sanioto, 1965) | + |
| Menadione and aromatic hydrocarbons | (Cilento and Sanioto, 1963) | + |
| Purines or pyrimidines with chloranil or 1,3,5-trinitrobenzene | (Beukers and Szent-Gyorgyi, 1962) | + |
| Nucleic acid bases and chloranil | (Machmer and Duchesne, 1965) | + |
Complex


Reference CTC band 


(LuValle et al., 1963) 

(Duchesne et al., 1965) + 


Indole, thymine, and cytosine with chloranil
Nucleosides or nucleotides of the nucleic acid bases and chloranil
Purine and pyrimidine nucleotides with mutagenic acridines
Mutagenic acridines and tetracyanoethylene
Purines and pyrimidines with steroids
Aromatic amino acids and purines with isoalloxazine derivatives
Interaction of purines and pyrimidines with flavins
Intramolecular complex between adenine and isoalloxazine in FAD
Pyridinium ring and adenine in NADH2 (intramolecular)
Pyridinium ring and adenine in NAD+ (intramolecular)
FMN with FMNH2
NADH and FMN
NAD-NADH
Crystals of 8-azaguanine monohydrate
Phenols and flavins
Carcinogenic hydrocarbons and iodine
Carcinogenic hydrocarbons and acridine
Phenothiazines and metals
Amino acids and proteins with riboflavin, chloranil and oxygen
Aromatic carcinogens with iodine, chloranil, trinitrobenzene and acridine
β-carotene with iodine
Nucleic acid bases with chloranil, iodine and riboflavin
Porphyrins with heterocyclic molecules, including purines


(Duchesne and Machmer, 1965) + 

(Duchesne and Machmer, 1965) + 


(Gibson et al., 1962) −
(Isenberg et al., 1961) +
(Szent-Gyorgyi et al., 1961)
(Cilento and Schreier, 1964)
(Macintyre et al., 1965; Macintyre, 1965) −
(Fleischman and Tollin, 1965a,b) +
(Szent-Gyorgyi et al., 1960) +
(Szent-Gyorgyi and McLaughlin, 1961) +
(Borg, 1961)
(Slifkin, 1962; Birks and Slifkin, 1963)
(Slifkin, 1964)
(Slifkin, 1963)
(Epstein et al., 1964)
(Lupinski, 1962)
(Slifkin, 1965)
(Mauzerall, 1965)
(Molinari and Lata, 1962)
(Harbury and Foley, 1958)
(Harbury et al., 1959)
(Tsibris et al., 1964)
(Weber, 1950)
(Weber, 1957)
(Cilento and Schreier, 1964)


a Electron spin resonance studies. 
b No complexation. 


+ -H + + + 
charge transfer complex is firmly established in only a limited number of
cases, in so far as the existence of a characteristic charge transfer band has
been observed only rarely. In fact, such a band seems to occur essentially
when one of the partners in the complex is a pyridinium or a quinone
ring, both of which, as will be seen shortly, are strong electron acceptors. In most of
the other cases what is generally observed is a more or less satisfactory
correlation between the association constants and the electron donor or
acceptor properties of the molecules involved. The correlation is then
considered as indicative of a possible involvement of charge transfer as a
significant component in the forces governing the formation of the complex.

III. Electron Donor and Acceptor Properties of Biomolecules 

This situation gives rise to a number of problems. In the first place it is
obvious that the knowledge of the ionization potentials and electron affinities
of biomolecules, which measure respectively their electron donor and
acceptor abilities, is of fundamental significance for the appreciation of charge
transfer complexations. In view of the complete absence of experimental
information about the values of these quantities in biomolecules, the
contribution of the theory, even if suggesting only approximate values, is then
obviously of particular importance.

The simplest evaluation of these properties may be obtained through
the use of the Hückel approximation of the molecular orbital method,
it being understood that such an evaluation is particularly suitable for the
determination of the relative electron-donor-acceptor properties of the
molecules. The appropriate indices are the energies of the highest filled
molecular orbitals for the electron donor capacity and the energies of the
lowest empty molecular orbitals for the electron acceptor abilities. The
calculations yield these energies in the form $E_i = \alpha + K_i \beta$, where $\alpha$ is the
coulomb and $\beta$ the resonance integral of the method. The values of $K_i$
are generally in the range of 0 to 1.5 for the highest filled molecular orbital
and of 0 to -1.5 for the lowest empty molecular orbital. The closer to zero the
values of the coefficients for both orbitals, the greater respectively the electron
donor or the electron acceptor properties of the molecules (Pullman and
Pullman, 1963). The reliability of these quantities for conclusions in this
field is substantiated by correlations obtained between the theoretical and
experimental sets of data in series of fundamental molecules in which both
are known (Streitwieser, 1961). The possession of such a relative scale is
frequently sufficient for the elucidation of the nature of the partnership in
the complex and the verification of the aforementioned correlations. The
essential results in this field (Pullman and Pullman, 1963) are summed up in
Table 2 from the examination of which it appears that, broadly speaking,
TABLE 2

Energy Coefficients of Molecular Orbitals (in β units)

| Compound | Highest filled molecular orbital | Lowest empty molecular orbital |
|---|---|---|
| Purine | 0.69 | -0.74 |
| Adenine | 0.49 | -0.87 |
| Guanine | 0.31 | -1.05 |
| Hypoxanthine | 0.40 | -0.88 |
| Xanthine | 0.44 | -1.01 |
| Uric acid | 0.17 | -1.19 |
| Uracil | 0.60 | -0.96 |
| Thymine | 0.51 | -0.96 |
| Cytosine | 0.60 | -0.80 |
| Barbituric acid | 1.03 | -1.30 |
| Alloxane | 1.03 | -0.76 |
| Phenylalanine | 0.91 | -0.99 |
| Tyrosine | 0.79 | -1.00 |
| Histidine | 0.66 | -1.16 |
| Tryptophan | 0.53 | -0.86 |
| Riboflavin | 0.50 | -0.34 |
| Pteridine | 0.86 | -0.39 |
| 2-Amino-4-hydroxypteridine | 0.49 | -0.65 |
| 2,4-Diaminopteridine | 0.54 | -0.51 |
| 2,4-Dihydroxypteridine | 0.65 | -0.66 |
| Folic acid | 0.53 | -0.65 |
| Porphin | 0.30 | -0.24 |
| 1,3-Divinylporphin | 0.29 | -0.23 |
| 1-Vinyl-5-formylporphin | 0.30 | -0.21 |
| α-Carotene | 0.10 | -0.19 |
| β-Carotene | 0.08 | -0.18 |
| Vitamin A1 | 0.23 | -0.31 |
| Vitamin A2 | 0.20 | -0.26 |
| Retinene | 0.28 | -0.26 |
| p-Benzoquinone | 1 | -0.23 |
| 1,4-Naphthoquinone | 1 | -0.33 |
| 9,10-Anthraquinone | 1 | -0.44 |
| Benzohydroquinone | 0.63 | -1 |
| Naphthohydroquinone | 0.41 | -0.71 |
| Anthrahydroquinone | 0.23 | -0.53 |
| NAD+ | 1.03 | -0.36 |
| NADH | 0.30 | -0.92 |
| FMN | 0.50 | -0.34 |
| FMNH2 | -0.11 | -0.95 |

the biomolecules may be divided from the point of view under consideration
into three groups:

(1) Compounds likely to function essentially as electron donors.
These include the purines (moderate electron donors with the exception of
uric acid which is predicted to be a very good electron donor), pyrimidines
(very poor donors, with some of them, e.g. alloxane, even predicted to be
rather acceptors), α-amino acids of proteins (poor donors with the exception
of tryptophan which should be a moderate donor), reduced forms of flavins
and of pyridine nucleotides (good donors) and some dyes of pharmacological
interest, in particular in the series of phenothiazines (very good donors).

(2) Compounds likely to function essentially as electron acceptors.
These include the oxidized forms of flavins and of pyridine nucleotides,
some (but not all) pteridines, quinones, and bile pigments.

(3) Compounds likely to function as both electron donors and
acceptors, and generally, as both good donors and acceptors. These are
essentially the porphyrins, carotenes, and retinenes.
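
The behavior of the third group can be illustrated with the simplest Hückel model. The sketch below is an illustration only, assuming an idealized linear polyene in which all carbon atoms share the same coulomb integral and all bonds the same resonance integral, so that the coefficients have the closed form $K_j = 2\cos[j\pi/(n+1)]$. It is not the parametrization used for the heteroatomic biomolecules of Table 2, and the function names are hypothetical.

```python
import math


def polyene_coefficients(n_atoms):
    """Hückel coefficients K_j (E_j = alpha + K_j * beta) of a linear chain.

    Assumes n_atoms equivalent carbon 2p orbitals with identical nearest-
    neighbour resonance integrals. The list runs from the most bonding
    level (largest K) to the most antibonding one.
    """
    return [2.0 * math.cos(j * math.pi / (n_atoms + 1))
            for j in range(1, n_atoms + 1)]


def frontier_coefficients(n_atoms):
    """Return (K_HOMO, K_LUMO) for a neutral chain with an even atom count."""
    if n_atoms < 2 or n_atoms % 2:
        raise ValueError("expected an even number of atoms, at least 2")
    k = polyene_coefficients(n_atoms)
    n_occupied = n_atoms // 2          # two electrons per orbital
    return k[n_occupied - 1], k[n_occupied]


# Lengthening the conjugated chain pulls both frontier coefficients
# towards zero at the same time.
for n in (2, 4, 10, 22):
    homo, lumo = frontier_coefficients(n)
    print(n, round(homo, 3), round(lumo, 3))
```

In this idealized model both frontier coefficients approach zero as the chain grows, which is the qualitative reason why extended conjugated systems such as the carotenes are expected to be good donors and good acceptors at once. The numbers are not expected to reproduce the entries of Table 2.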

At this point it is useful to add that in some, although still rare,
cases more refined, self-consistent field calculations have been carried out
with the aim of obtaining more precise, absolute values of the ionization
potentials of some of the electron donors quoted above (reliable calculations
of electron affinities are, as is well known, very difficult to perform). This is in
particular the case of purines and pyrimidines present in the nucleic acids
(Pullman and Rossi, 1964). The refined calculations carried out for these
particularly important biomolecules confirmed the trend predicted by the Hückel
method and in particular confirmed that guanine should be the best electron
donor among the bases of the nucleic acids. They also confirm the approximate
validity of the Wacks-Dibeler (1959) equation:

$$I = (3.14 \pm 0.24) K_i + (6.24 \pm 0.10),$$

which proposes to relate the ionization potentials $I$ (in electron volts) to the
coefficients $K_i$ of the highest occupied molecular orbital.
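
As a worked illustration, the central values of the Wacks-Dibeler equation can be applied to the highest filled orbital coefficients listed in Table 2 for the nucleic acid bases. The sketch below uses only the central slope and intercept and ignores the quoted uncertainties; the function and variable names are hypothetical.

```python
SLOPE_EV = 3.14       # central value of the slope quoted in the text
INTERCEPT_EV = 6.24   # central value of the intercept quoted in the text


def ionization_potential(k_homo):
    """Estimate I (eV) from the highest filled orbital coefficient K_i."""
    return SLOPE_EV * k_homo + INTERCEPT_EV


# Highest filled orbital coefficients taken from Table 2.
K_HOMO = {
    "adenine": 0.49,
    "guanine": 0.31,
    "thymine": 0.51,
    "cytosine": 0.60,
    "uracil": 0.60,
}

# Print the bases from best to poorest predicted electron donor.
for base in sorted(K_HOMO, key=lambda name: ionization_potential(K_HOMO[name])):
    print(f"{base:9s} K = {K_HOMO[base]:.2f}  I = {ionization_potential(K_HOMO[base]):.2f} eV")
```

Because the relation is linear with a positive slope, the ordering of the estimated ionization potentials is simply the ordering of the coefficients, with guanine lowest among the bases listed.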

The theoretical predictions have also received a number of striking
experimental, although indirect, verifications. This is again, in particular, the case
of purines and pyrimidines, and the verifications come essentially from the
studies of the electrochemical behavior of these compounds. Table 3
summarizes the results of researches on polarographic oxidability and reducibility
of the bases (Smith and Elving, 1962; Struck and Elving, 1964; Elving et al.,
1966; Pullman, 1965) and it can be observed that the results are in good
agreement with predictions based on the values of the coefficients of the
molecular orbitals. Thus, the oxidizable compounds have lower values of the
coefficient of their highest filled orbital than the nonoxidizable ones and,
similarly, the reducible compounds have smaller absolute values of the
TABLE 3

Electron Donor and Acceptor Properties of Purines and Pyrimidines

| Compound | Highest filled MO | Polarographic oxidability | Lowest empty MO | Polarographic reducibility |
|---|---|---|---|---|
| Purine | 0.69 | − | -0.74 | + |
| Adenine | 0.49 | + | -0.87 | + |
| Guanine | 0.30 | + | -1.05 | − |
| Hypoxanthine | 0.40 | + | -0.88 | + |
| Xanthine | 0.44 | + | -1.01 | − |
| Uric acid | 0.17 | + | -1.19 | − |
| Uracil | 0.60 | − | -0.96 | − |
| Thymine | 0.51 | − | -0.96 | − |
| Cytosine | 0.60 | − | -0.80 | + |
| Barbituric acid | 1.03 | | -1.30 | |
| Alloxane | 1.03 | | -0.76 | + |


coefficients of their lowest empty orbital than do the nonreducible ones.
It may be added that in striking agreement with theory uric acid is the most
easily oxidized of all the compounds tested. The polarographic results also
confirm the prediction that guanine is the most easily oxidized compound
among the bases of the nucleic acids.

The good electron donor properties predicted for porphyrins and carotenes
are in agreement with the relatively low value of their ionization potentials
(Terenin and Vilessov, 1964), and the same holds for the very good donor
properties predicted for phenothiazine (Kearns and Calvin, 1961; Lyons and
Mackie, 1963).

The ability of the highly conjugated carotenes and retinenes to function
also as electron acceptors is substantiated by their easy polarographic
reducibility (Kuta, 1964).


IV. Established Charge Transfer Complexes 

The results of Table 2 may be used in the study of charge transfer
complexations in two ways. In the case of well-defined complexes, characterized by
the existence of a charge transfer band, they may lend further support to the
general theory by showing the existence of the expected correlations between
the frequency of the band and, depending on the case, the ionization potentials
of the donors (when the acceptor is constant) or the electron affinities of the
acceptors (when the donor is constant). The following are particularly
striking examples of such successful correlations.
(1) The work of Shifrin (1964a,b) on the model compounds (for the study of
enzyme-coenzyme interactions) of type I, where the symbol AAC represents
the conjugated ring of the aromatic amino acid of proteins and in which
a correlation appears (Fig. 1) between the frequency of the charge transfer
band and the theoretical ionization potential of the aromatic donor as
measured by the coefficient of its highest occupied molecular orbital. The
work confirms the relatively important electron-donor properties of the
indole ring. The point representative of imidazole lies, however, somewhat
off the curve.

[Structure I]

Fig. 1. Wavelength of the charge transfer band vs energy of the highest occupied
molecular orbital in compounds of Type I.

(2) The work of Machmer and Duchesne (1965) on the charge transfer
complexes between nucleic acid bases and chloranil, in which a linear
correlation can be shown to exist (Fig. 2) between the peak of the charge transfer
band and the ionization potentials of the bases. The work confirms the
relatively important electron-donor properties, among the nucleic bases, of
guanine.

Fig. 2. The wavelength of the charge transfer band vs the ionization potential of the
bases in complexes of nucleic acid bases with chloranil.

(3) The work of Kosower (1960) on the model complexes between
pyridinium compounds and iodine, where it may be shown (Fig. 3) that the
wavelength of the charge transfer band is related linearly to the electron affinity
of the pyridinium compounds, as measured by the coefficients of their lowest
empty molecular orbitals.



Fig. 3. The wavelength of the charge-transfer absorption band vs the energy of 
the lowest empty molecular orbital in complexes of methylpyridinium compounds with 
iodine. 
V. Dubious (Partial?) Charge Transfer Complexations

On the other hand, in a number of cases in which molecular associations
were found but in which no charge transfer band has been observed,
correlations have nevertheless been noticed between the association constants and
the theoretical data on the electron donor or acceptor properties of the
molecules. This is the case, in particular, for a number of associations
involving purines and pyrimidines as electron donors with a constant electron
acceptor such as, e.g., riboflavin (Tsibris et al., 1965), the antibiotic actinomycin
(Pullman, 1964a), or the carcinogenic 3,4-benzpyrene (Pullman, 1964b). In
such cases, the correlation was considered as signifying that charge transfer
forces play a significant role in these associations. The two essential questions
which may be raised, of course, are: how significant? and, what for? Recent
general results on the evaluation of the intermolecular forces responsible for
the “stacking” type interactions between conjugated aromatic molecules at
relatively short distances (to which the above quoted associations probably
belong) seem to indicate that in spite of the observed correlations the
contribution of charge transfer to the binding energy of such complexes may in
fact be rather restricted and represent only a relatively small percentage of
the over-all interaction energy.

Thus, e.g., calculations indicate (Mantione and Pullman, 1966) that the
contribution of the charge transfer forces to the energy of the stacking-type
auto-association of the purines and pyrimidines in aqueous solution or to
the association of purines with hydrocarbons is relatively small, amounting
only to a fraction of the contribution evaluated for the Van der Waals-London
forces (Pullman et al., 1965a,b). On the other hand this situation
does not preclude the possible importance of the effect on some other
physicochemical properties of the complexes, as suggested e.g. by the recent
proposition of Macintyre (1965; Macintyre et al., 1965) according to which the
shortening of the interplane distance in the crystals of 8-azaguanine with
respect to the usual separation between the planes of aromatic molecules
could be due to charge transfer interactions. Such interactions operating
inside the nucleic acids could also be related to the semiconductivity of these
macromolecules (Brillouin, 1962; Pullman, 1964c). As to the direct biological
significance of such complexes, not much seems to be known at
present, and in fact it may be useful to reemphasize the extreme danger of
postulating without valid arguments their decisive involvement in biological
phenomena. Thus, e.g., a number of authors have postulated in recent years
the involvement of electron transfer phenomena in the mechanism of
carcinogenesis by aromatic hydrocarbons or heterocyclics. These propositions were
all based on some limited correlations between the electron donor or acceptor
properties of some selected chemicals and their carcinogenicity. It was very
easy to show simply by enlarging the number of chemicals considered that the
proposed correlations were fictitious (Pullman, 1964b). Explicit recent
experimentation (Epstein et al., 1964) on the ability of the chemical carcinogens to
take part in charge transfer complexes with a number of acceptors (iodine,
chloranil, trinitrobenzene, acridine) has entirely confirmed the soundness of
the critical approach and demonstrated the absence of any correlation between
carcinogenicity and charge transfer complexation.


VI. n-π and “Local” Charge Transfer Complexations

In the preceding discussion we have been essentially concerned with the
electron donating and accepting properties, and thus with the possible
involvement in charge transfer complexes, of the π electrons of biomolecules. The
majority of such molecules contain, however, heteroatoms with lone pairs
of electrons (n electrons) which may also be implicated in “local” charge
transfer complexes. Refined, self-consistent field calculations, in so far as
they are available, show that in the large conjugated biomolecules of the type
that we are interested in, the lowest ionization potential is probably generally
that of the π electrons. This is in particular the case for biological purines
and pyrimidines (Pullman and Rossi, 1964). Nevertheless, the lone pairs
may be predicted to be of more importance in charge transfer complexes
formed with saturated biomolecules and in fact the involvement of the lone
pairs in charge transfer complexes has been postulated to occur in the
interactions of the α-amino acids of proteins with different electron acceptors
(Slifkin, 1962, 1963; Birks and Slifkin, 1963).

The concept of a “local” charge transfer complex, involving electron
transfer through a localized site at the molecular periphery, has also been
advocated in the case of π-π complexes, in particular of those involving indolic
compounds (Szent-Gyorgyi and Isenberg, 1960), whose complexing ability
seems to be superior to what might be expected from the electron donor
properties of such molecules, as evaluated by the energy of their highest
filled molecular orbital. Another interpretation of this situation has, however,
also been proposed in terms of the particularly outstanding complementary
charge distribution in the linked partners (Karreman, 1961, 1962).

Note added in proof: The list of the principal researches on charge transfer
complexes in biochemistry given in Table 1 may be usefully completed with
the following recent references: (a) complexes between phenothiazines and
acceptors (Foster and Hanson, 1966; Foster and Fyfe, 1966) in which a CTC
band is observed; (b) complexes between oxidized flavins and indole
derivatives and between both oxidized and reduced flavins and purines (Wilson,
1966), a new absorption band being observed only in the case of the complexes
involving indoles; (c) complexes between NAD+ analogs and reduced flavin
mononucleotide (Sakurai and Hosoya, 1966), showing a CTC band, the
frequency of which correlates linearly with the energies of the lowest empty
molecular orbitals of the NAD+ analogs.
I. Introduction

The augmented plane wave (APW) method, originally proposed by Slater
(1937) and extended by Slater and Saffren (1953), has in recent years become
the leading method for determining theoretically electronic energy-band
structures, particularly in metals. The success of the method lies in its ease of
application (thanks to the computer techniques developed by Saffren (1959)
and Wood (1960)) and in the fact that highly accurate solutions of the periodic
potential problem (for a given potential) may be obtained. Well-known
applications of the APW method to the determination of the energy-band
structure of metals include those of Saffren, Wood, Burdick, Hanus, and
Switendick. [An excellent description of the APW method, its relation to other
energy-band methods, and an extensive review of the literature are given by
Slater (1965).] Loucks’ (1965) recently completed relativistic version of the
APW scheme has greatly increased the applicability of the method, particularly
to those systems where spin-orbit and/or other relativistic terms are important.

* Supported by the U.S. Air Force Office of Scientific Research.
† Supported by the U.S. Atomic Energy Commission.

In this paper we report on some calculations we have performed, using the 
nonrelativistic APW method, to obtain the electronic energy bands of the 
heavy rare-earth metals. This work was undertaken as the first part of a 
program to investigate, from a theoretical viewpoint, the electronic and 
magnetic properties of the type 4f hexagonal rare-earth metals. We will be 
concerned here primarily with the results of these calculations and with their 
immediate implications in terms of the electric, optical, and magnetic properties 
of these materials. In addition, we discuss briefly the problems involved in 
calculating from first principles the electronic energy bands in crystals in 
general and also specifically in metals. We also discuss the assumptions and 
approximations peculiar to the APW method as used in current calculations 
and specifically in the calculations we have performed. Much of what we
discuss is well known, but is included here because of its particular relevance to
this volume and to viewing the results we shall report. Finally, we discuss the 
results obtained for the heavy rare-earth metals and our expectations as to 
their general validity. 


II. The Heavy Rare-Earth Metals

The heavy rare-earth metals have been viewed traditionally as consisting 
of trivalent atomic cores including an unfilled or partially filled 4f shell plus 
three conduction electrons per atom. Due to the lack of a more detailed 
model of the conduction bands in rare-earth metals, previous theoretical work 
has attempted to explain the available experimental data by assuming a simple 
model in which the three conduction electrons occupy essentially free electron 
bands perturbed perhaps by a fairly small crystal potential. Much of this 
theoretical work appears to depend critically on details of the free electron 
model for the conduction electrons. We undertook to calculate these energy 
bands in hopes of obtaining a more precise model for the rare-earth metals. 

The principal experimental properties of the heavy rare-earth metals in
which we are interested fall into three categories: magnetic properties,
optical properties, and electric properties. Owing to the partially filled 4f
shell, which in all of the heavy rare-earth metals possesses a localized magnetic
moment, these materials order at low temperatures to form various magnetic
structures. Some of these structures are shown (Koehler, 1965) in Fig. 1. As
indicated in the figure, the heavy rare-earth metals possess various magnetic
configurations, and transitions between these configurations occur for any
given material at different temperatures. In Table 1, we show some additional
physical properties of the heavy rare-earth metals. Note in particular the
measured magnetization per atom as indicated in the fourth column. The
expected moment due to the 4f electrons alone is shown in the last column.
In all cases the numbers in these two columns are somewhat different,
indicating that a contribution from the polarization of the conduction band
electrons due to an exchange with the 4f shell is to be expected.

Fig. 1. Schematic representation of magnetic structures of rare-earth metals after
Koehler (1965). The moments are supposed to be parallel in a given hexagonal layer.

TABLE 1

Some Physical Properties of the Heavy Rare-Earth Metals (a)

| Metal | T_N (°K) | T_C (°K) | μ (μ_B) | gJ (μ_B) |
|---|---|---|---|---|
| Gd | | 293.2 | 7.55 | 7.0 |
| Tb | 229 | 221 | 9.34 | 9.0 |
| Dy | 178.5 | 85 | 10.20 | 10.0 |
| Ho | 132 | 20 | 10.34 | 10.0 |
| Er | 85 | 19.6 | 8.0 | 9.0 |
| Tm | 51-60 | 22(?) | 3.4 | 7.0 |

(a) As listed by Koehler (1965).
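
The last column of Table 1 can be reproduced from Hund's rules and the Landé g factor. The sketch below is an illustration only: it assumes trivalent ions with 4f occupations of 7 (Gd) through 12 (Tm), which is the usual assignment but is not stated explicitly in the text, and the function names are hypothetical.

```python
from fractions import Fraction

L_F = 3  # orbital quantum number of an f shell


def hund_ground_state(n_f):
    """Return (S, L, J) for the ground term of a 4f^n shell (Hund's rules)."""
    n_orbitals = 2 * L_F + 1
    if not 0 <= n_f <= 2 * n_orbitals:
        raise ValueError("an f shell holds between 0 and 14 electrons")
    n_up = min(n_f, n_orbitals)       # first rule: maximize the total spin
    n_down = n_f - n_up
    m_l = list(range(L_F, -L_F - 1, -1))   # second rule: fill m_l = 3, 2, ..., -3
    s_tot = Fraction(n_up - n_down, 2)
    l_tot = abs(sum(m_l[:n_up]) + sum(m_l[:n_down]))
    # Third rule: J = |L - S| up to half filling, J = L + S beyond it.
    j_tot = l_tot + s_tot if n_f > n_orbitals else abs(l_tot - s_tot)
    return s_tot, l_tot, j_tot


def lande_moment(n_f):
    """Return gJ, in Bohr magnetons, for the Hund ground state of 4f^n."""
    s_tot, l_tot, j_tot = hund_ground_state(n_f)
    if j_tot == 0:
        return Fraction(0)
    g = Fraction(3, 2) + (s_tot * (s_tot + 1) - l_tot * (l_tot + 1)) / (2 * j_tot * (j_tot + 1))
    return g * j_tot


# Assumed 4f occupations of the trivalent ions.
F_ELECTRONS = {"Gd": 7, "Tb": 8, "Dy": 9, "Ho": 10, "Er": 11, "Tm": 12}

for metal, n_f in F_ELECTRONS.items():
    print(metal, lande_moment(n_f))
```

Exact rational arithmetic is used so that half-integer quantum numbers do not introduce rounding; the results can be compared directly with the gJ column of Table 1.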

The principal electrical property which we would like to understand at 
present is the resistivity of these materials measured as a function of tempera¬ 
ture. This shows effects due to two causes. First, there is evidence of spin 
disorder scattering in the strong temperature dependence of the conductivity. 
Secondly, large discontinuities occur due to the various magnetic transitions 
which these materials undergo (Hall et al., 1958; Colvin et al., 1960; Hegland 
et al., 1963). Measurements of the resistivity of single crystals, e.g., erbium, 
show a strong anisotropy. The discontinuities are present primarily in the 
resistivity measured along the hexagonal or c-axis of the crystal. This anomaly 
has been interpreted in terms of superzone gaps introduced perpendicular 
to the c-axis at the transition temperature by the onset of long-range magnetic 
ordering (Mackintosh, 1962; Miwa, 1963; Elliott and Wedgwood, 1963). 


III. Approximations Involved in So-Called First Principle Band Calculations 

Before discussing our specific results for the rare-earth metals, it is useful 
to discuss briefly the assumptions and approximations inherent in energy-band
calculations. The calculation of electronic eigenstates in crystals and
solids is essentially a many-body problem and entails the solution of
Schrödinger’s equation for approximately $10^{23}$ nuclei and electrons. Obviously, this
is a completely hopeless task without the addition of numerous simplifying
approximations.

A. Reduction of many-body problem to one-electron form 

The first set of approximations is made in order to reduce the many-body
problem to that of a single electron in a periodic potential. These
approximations are serious and are not completely justifiable. The first of
these is the Born-Oppenheimer approximation which essentially amounts to
neglecting the electron-phonon interaction and reduces the problem to that
of an interacting electron system only. Actually, electron-phonon interactions
in metals can cause an enhancement of the measured electron mass and the
measured oscillator strength for optical transitions by a factor of up to 2.5
for some polyvalent metals. These effects are generally classified as polaron
effects. The neglect of the electron-phonon interaction does not appear to be
as bad for alkali metals as it is for polyvalent metals and in general it does not
appear to seriously affect Fermi surface dimensions even though it modifies
the density of states obtained from specific-heat measurements. Effects due
to electron-phonon interactions could conceivably be much more serious in
insulators where conductivity occurs via a hopping process.

The second assumption is the use of the Hartree-Fock (HF) approximation 
which reduces the problem to that of an independent electron model and which 
neglects electron-electron correlations among electrons of opposite spin. 
Even with this assumption, the problem is still a many-body problem and its 
exact solution is beyond our present capabilities. It is uncertain at this time 
if electron-electron correlations play an important role in the electronic 
band structure of metals. There has, however, been some recent speculation 
that such effects are important in describing the optical properties of some 
alkali metals. Electron-electron correlations are quite important in
semiconductors and insulators and lead, for example, to the well-known exciton
effects in these materials. 

The third approximation is that of averaging the exchange term which
arises in the Hartree-Fock equations. In order to obtain an effective single
electron local potential, it is necessary to average this nonlocal term in one
manner or another. It can be averaged over atomic orbitals, leading to an
$\ell$-dependent exchange term. However, the more common approach is to use
Slater’s $\rho^{1/3}$ approximation for the free electron gas. Serious questions have
been raised as to the applicability of this approximation to the core (including
rare earth 4f) electrons in metals. The principal justification of this approach
is that it is simple and relatively easy to use.
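
For concreteness, the $\rho^{1/3}$ exchange term can be written as a one-line function. The sketch below assumes the commonly quoted form $V_x(\rho) = -6\,[3\rho/8\pi]^{1/3}$ in Rydberg atomic units, with $\rho$ the electron number density in electrons per cubic bohr. The text does not give the explicit formula or the units, so both are assumptions here, and the function name is hypothetical.

```python
import math


def slater_exchange_potential(rho):
    """Local exchange potential in rydbergs for a density rho (electrons/bohr^3).

    Assumed form: V_x = -6 * (3 * rho / (8 * pi)) ** (1/3), i.e. the
    free-electron-gas average of the Hartree-Fock exchange term.
    """
    if rho < 0:
        raise ValueError("the electron density cannot be negative")
    return -6.0 * (3.0 * rho / (8.0 * math.pi)) ** (1.0 / 3.0)


# The cube-root dependence means an eightfold increase in density only
# doubles the magnitude of the exchange potential.
for rho in (0.01, 0.08, 0.64):
    print(rho, slater_exchange_potential(rho))
```

The weak cube-root dependence on the density is what makes the approximation so easy to apply, since only the local charge density is needed at each point.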

B. The APW “muffin tin” potential 

With these three approximations—Born-Oppenheimer, Hartree-Fock, and 
the Slater average of the exchange term—the many-body problem is reduced 
to that of a single electron in a periodic potential. The form of this potential 
in the APW method is that of a “muffin tin,” where the potential is made 
spherically symmetric about each atomic site and is taken to be flat between 
the APW spheres. The potential is constructed by taking a superposition of 
spherically averaged atomic charge densities from the neighboring atoms 
using Lowdin’s expansion techniques. The atomic charge densities are obtained 
from the appropriate atomic wave functions calculated for the free atoms or 
ions. The potential between the muffin tin spheres is flat and is usually taken 
as an average of the potential over this region as obtained from the superposition. 
This potential contains a number of adjustable parameters. In the case of 
compounds, one can vary the assumed ionicity of the various components and 
their Madelung energies, since both of these quantities are usually known 
only within broad limits. In obtaining the atomic charge densities, one can 
vary the atomic configuration and the state of ionization assumed in the free 
atom calculations. One can also vary the radii of the APW muffin tins and the 
potential between the spheres. 
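
The muffin-tin form described above can be illustrated with a minimal sketch. It assumes a toy cubic cell containing a single atom and a screened Coulomb-like model potential; the function names (`muffin_tin`, `interstitial_average`) and all numerical values are hypothetical and are not taken from the APW programs used in this work.

```python
import math
from itertools import product

def muffin_tin(point, sites, radius, v_inside, v_flat):
    """Muffin-tin potential: spherically symmetric inside each sphere, flat outside."""
    for site in sites:
        d = math.dist(point, site)
        if d < radius:
            return v_inside(d)  # depends only on the distance from the site
    return v_flat

def interstitial_average(v_full, sites, radius, cell_edge, n=12):
    """Average a full potential over grid points of a cubic cell lying outside all spheres."""
    total, count = 0.0, 0
    step = cell_edge / n
    for i, j, k in product(range(n), repeat=3):
        p = ((i + 0.5) * step, (j + 0.5) * step, (k + 0.5) * step)
        if all(math.dist(p, s) >= radius for s in sites):
            total += v_full(p)
            count += 1
    return total / count

# Toy model (assumed values): one atom at the center of a cubic cell of edge 2,
# with sphere radius 1 so that spheres in neighboring cells touch.
sites = [(1.0, 1.0, 1.0)]
model = lambda d: -2.0 * math.exp(-d) / d        # hypothetical screened potential
full = lambda p: model(math.dist(p, sites[0]))   # single-site stand-in for the superposition
v_flat = interstitial_average(full, sites, 1.0, 2.0)

print(muffin_tin((1.0, 1.0, 1.5), sites, 1.0, model, v_flat))  # inside a sphere
print(muffin_tin((0.0, 0.0, 0.0), sites, 1.0, model, v_flat))  # between the spheres
```

In this picture the adjustable quantities named in the text (sphere radius, flat interstitial value, and the atomic input behind the spherical part) appear explicitly as arguments, which is what makes a systematic variation of them straightforward.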

C. Critique of procedure 

It is now important to ask if there is any justification of the numerous 
assumptions and approximations which we have described. The answer is that 
there is at present no real justification of the first three approximations made 
to reduce the many-body problem to that of a single electron in an effective 
periodic potential, other than a (dangerous) comparison of the final results of 
the calculations with experiment. Solutions of the periodic potential problem 
using an effective one-electron potential can actually be obtained to almost 
any accuracy desired with the use of modern-day computers. The only real 
problem which remains is that of obtaining a good self-consistent effective 
starting potential and of investigating the assumptions made in order to 
reduce the many-body problem to that of a single electron in a periodic 
potential. 

One approach might be to attempt to achieve self-consistency in the solutions 
by an iterative process. Here, one can use the calculated wave functions 
for the crystal to compute a new effective one-electron potential and iterate 
this process until convergence is achieved. This process, however, is costly 
in computer time and is still limited by the first three assumptions (mostly that 
of the averaged exchange) and also by the form assumed for the APW potential. 
The conclusion is that this procedure may not be worth the trouble, primarily 
because the answer obtained is probably no better than that arrived at in the 
first calculation. 

Perhaps a more reasonable approach is to systematically vary the several 
adjustable parameters which enter the APW potential and to investigate the 
sensitivity of the calculated results to this variation. The range over which the 
results vary can, in some sense, be considered as a limit to their theoretical 
validity. One must recognize, however, that this entails only an investigation 
of the assumed single-electron effective potential and does not really investigate 
the assumptions necessary to obtain this form for the problem. In the 
case of compounds, one must vary the ionicity and Madelung energy within 
a reasonable range about the estimated values of these quantities. One must 
also investigate the effects due to a variation of the atomic configurations 
and state of ionization to the extent that these are not known for the solid. 

The radii of the muffin tins are chosen to be those values such that the 
APW spheres for the various constituent atoms touch. The potential between 
the spheres is usually taken, at least for metals, as some suitable average over 
the potential in this region obtained from the superposition of the atomic 
charge densities. Some results of this procedure carried out for a few materials 
to date can be summarized as follows: for nearly free electron metals for which 
the energy bands are least sensitive to a variation of the crystal potential, 
one finds, within the single-electron approximation, that the electronic-band 
energies are probably good to about 0.2 eV. For transition metals the situation 
is not nearly so encouraging. Band-energy variations of between 1 and 5 eV 
have been found to occur using different starting potentials. 

The situation in semiconductors is somewhat more difficult to assay since 
considerable experimental data have been available, usually prior to the 
theoretical calculations, and in some cases the calculations appear to have 
been influenced by the experimental results. (Some theoretical procedures are 
semi-empirical in that the calculations are adjusted to fit existing data. This 
can be done with the APW method by selectively adjusting the various parameters 
when there are sufficient experimental data available. This procedure 
probably results in the most accurate energy band for semiconductors. The 
pseudopotential method, which is probably the best known of these procedures, 
appears to have had considerable success in determining energy bands both 
in semiconductors and in metals. However, we are not considering this 
procedure here, but instead are concerning ourselves only with so-called 
first principle calculations. One can question, however, the use of the term 
“first principle” in any band calculation which includes the numerous assumptions 
and approximations considered above.) For group IV semiconductors, it 
appears that the energy bands can be calculated to within about 1 eV without 
adjustment to fit experimental values for band gaps and other properties. 
The situation is expected to be somewhat worse in the case of the compound 
semiconductors. 

For the rare-earth metals in the present calculations, we find that the relative 
band energies can vary as much as 0.5 eV with different potentials. However, 
as is also true for the other classes of materials, some band gaps are 
much more sensitive than others. The variations quoted here are, in general, 
for the more sensitive bands. Nevertheless, one should keep these numbers in 
mind when assessing the validity of various energy-band calculations. 


IV. Electronic Band Structures of Rare-Earth Metals 

With this introduction and precaution, we are now in a position to discuss 
the results of the current energy-band calculations which we have obtained 
for the rare-earth metals. We discuss here primarily the results obtained for 
Gd since they are the most complete. The results which we have obtained also 
for La, Tm, and Lu are substantially the same as those of Gd and differ only 
in detail. 

First, consider the atomic structure of the rare-earth metals. These consist 
of a xenon core, a partially filled 4f shell (which in Gd contains seven 4f 
electrons) and three valence electrons in the atomic 5d and 6s states. The 
relative positions in energy of these states (Herman and Skillman, 1963) 
are shown in Fig. 2. The atomic 5d and 6s states lie close to one another in 
energy and are widely separated from the 4f shell. The xenon core states all 
lie at considerably lower energies and will not be further considered. From the 
figure one expects the 5d and 6s electrons to contribute to the conduction 
processes independently of the 4f electrons. 

Fig. 2. Relative positions of the energy of the outermost electrons in atomic Gd. 

In Fig. 3 we show the relative outer radial extent of the atomic electrons 
for Gd and also indicate the Wigner-Seitz radius appropriate for Gd metal. 
As can be seen, the 4f electrons are tightly bound to the atom and do not 
overlap neighboring atoms appreciably. Consequently in the band model, 
the 4f electrons will form a very narrow band. The calculations yield a 4f 
band with a width of about 0.05 eV located approximately 11 eV below the 
bottom of the 5d-6s bands. This separation, however, is very sensitive to the 
potential used in the calculations and is consequently unreliable. Furthermore, 
the energy-band picture for the 4f states essentially neglects the intra-atomic 
exchange energies of these electrons, which amount in Gd to several electron 
volts. Consequently, the 4f electrons cannot be treated as band electrons but 
must be considered as localized and so do not fit within the band picture at 
all. This makes their calculated position in energy even more unrealistic 
than is indicated by the variation obtained with different potentials. As can be 
seen from Fig. 3, the 5d and 6s atomic functions on different atom sites do 
overlap one another to a considerable extent. Consequently, they will form 
an s-d conduction band of considerable width. The contribution to this band 
from the 5d electrons is expected to be somewhat narrower than that of the 
6s electrons since the spatial overlap of the 5d functions is somewhat less 
than that of the 6s functions. Note that this is expected to bear a strong 
similarity to the situation found to exist in the case of transition metals. 

Fig. 3. Relative outer radial extent of the atomic 4f, 5s, 5p, 5d and 6s electrons in Gd. 
Also indicated is the Wigner-Seitz radius appropriate for Gd metal. 

The calculations for Gd have been carried out with a number of starting 
potentials. These have all been generated by superpositions of atomic charge 
densities obtained for free Gd atoms in which we have varied the assumed 
atomic configuration for the free atoms. We have used atomic charge densities 
obtained by using the Hartree-Fock-Slater (HFS) calculational procedure 
of Herman and Skillman (1963) applied to the atomic Gd configurations 
$4f^7 5d^1 6s^2$, $4f^7 5d^2 6s^1$, $4f^7 5d^3 6s^0$, and $4f^8 5d^0 6s^2$. We have also used the 
atomic Hartree-Fock charge density obtained by Freeman and Watson (1962) 
for the singly ionized Gd configuration $4f^7 6s^2$ to which was added a 5d function. 
A plot of these potentials is shown in Fig. 4. Note that, as expected, the differences 
in the potentials occur in the outer regions of the atom since we have 
varied the configuration of the outer electrons only. All of these potentials 
gave qualitatively the same results with a maximum variation in the relative 
energies of the conduction band states of about 0.5 eV. The results which we 
present are those obtained using the HFS $4f^7 5d^1 6s^2$ potential. These are 
representative of results obtained using the other potentials. 

A histogram of the calculated density of states for Gd metal is shown* in 
Fig. 5. The density of states of the conduction bands was obtained by dividing 
the Brillouin zone into 192 identical hexagons, each characterized by the 
energies calculated at its center. We show also for comparison the density of 
states given by the free-electron model. The d-bands originating from the 
Gd 5d states contribute a high density of states in the vicinity of the Fermi 
energy with a width of about 0.5 Rydbergs (6 eV). This width is nearly the 
same as that obtained by Wood (1960) for the 3d-bands in iron. Note that the 
Fermi energy lies at a peak in the density of states. The Fermi energy, for 
three electrons per atom, is $E_F = 0.25$ Ry measured from the bottom of the band 
as compared with a value of 0.54 Ry for the free-electron model. At the Fermi 
energy, the calculated density of states is large, $N(E_F) = 1.8$ electrons per 
atom per eV compared with the free-electron value of 0.6 eV$^{-1}$. This is due to 
the fact that the electron bands in the vicinity of the Fermi surface are of 
mixed s-d character and are consequently much flatter than would be expected 
from a free-electron model. 

* A preliminary report of this work was given earlier by Dimmock and Freeman (1964). 

Fig. 4. Plot of the effective nuclear charge for potential, $2Z_p(r)$, for the different 
configurations used in the calculations described in the text. 

Fig. 5. A histogram representation of the density of states, in electrons per atom per 
rydberg. The parabolic curve is the prediction of the free-electron model. 
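
As a rough check on the free-electron figures quoted above, the density of states at the Fermi energy of a free-electron gas is $N(E_F) = 3n/2E_F$ for $n$ electrons per atom (both spins). The sketch below evaluates this for $n = 3$ and $E_F = 0.54$ Ry, and also shows one way a histogram density of states could be built from band energies sampled at equally weighted points in the zone. The function names, the default of two atoms per cell (appropriate to an hcp cell), and the toy energies are assumptions made for illustration; this is not the procedure actually coded for the calculations reported here.

```python
RYDBERG_EV = 13.6057  # 1 Ry in eV

def free_electron_dos_at_fermi(n_electrons, e_fermi):
    """Free-electron N(E_F) per atom (both spins), in inverse units of e_fermi."""
    return 1.5 * n_electrons / e_fermi

def dos_histogram(energies_per_k, e_min, e_max, n_bins, atoms_per_cell=2):
    """Histogram density of states, in electrons per atom per unit energy.

    energies_per_k holds one list of band energies for each equally weighted
    sampling cell of the zone; every band state is counted for two spins.
    """
    width = (e_max - e_min) / n_bins
    counts = [0] * n_bins
    for bands in energies_per_k:
        for e in bands:
            if e_min <= e < e_max:
                counts[int((e - e_min) / width)] += 1
    norm = 2.0 / (len(energies_per_k) * atoms_per_cell * width)
    return [c * norm for c in counts]

# Free-electron value for three electrons per atom and E_F = 0.54 Ry.
n_free = free_electron_dos_at_fermi(3, 0.54 * RYDBERG_EV)
print(round(n_free, 2), "per eV per atom")
print(round(1.8 / n_free, 1), "= ratio of the calculated N(E_F) to the free-electron value")

# Toy input (hypothetical): a single band sampled at four points.
toy = dos_histogram([[0.1], [0.1], [0.3], [0.3]], 0.0, 0.4, 2)
print(toy)
```

The free-electron estimate comes out near 0.6 eV$^{-1}$, in line with the value quoted in the text, so the calculated $N(E_F)$ is roughly three times larger. In the toy histogram the area under the curve is one electron per atom, as it should be for one filled band in a cell of two atoms.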

One can note in Fig. 5 a small tail at lower energies in the density of states 
due to the bottom of the 6s band. This s-band tail can be seen more clearly in 
Fig. 6 where we plot the energy bands calculated for Gd metal. The high 
density of flat d-bands in the vicinity of the Fermi energy is very apparent in 
this figure. One should note that these are actually mixed s-d bands since the 
s-electrons contribute to the conduction band states throughout the entire 
region. Further, it should be emphasized at this time that the energy-band 
calculations which we are reporting are those obtained using a nonrelativistic 
method. If relativistic effects were included, they would result in two modifications 
to Fig. 6. The first is that there would be a relative downward shift of 
the s-bands with respect to the d-bands by about 0.4 eV corresponding to the 
relativistic shift of the 6s states in Gd with respect to the 5d states. In addition, 
spin-orbit coupling will split the energy bands at Γ, K, H, and A and, incidentally, 
throughout the ALH plane as well. This splitting is also about 0.4 eV. 

Fig. 6. Calculated $E(\mathbf{k})$ curves for the conduction bands of gadolinium metal along the 
symmetry directions Γ-K-H-A. 

The electronic properties of the rare-earth metals are determined largely 
by the shape, form, and topology of their Fermi surface. Figure 7 shows the 
complete Fermi surface for holes in gadolinium metal, in the double-zone 
scheme. As far as we can determine there are no additional pockets of holes or 
electrons. This Fermi surface permits open orbits both along the c-axis of the 
crystal and in the plane perpendicular to this axis. One should ask to what 
extent we consider this Fermi surface reliable. We feel that we have partially 
answered this question by repeating our calculations with various different Gd 
starting potentials. Since the qualitative features of the Fermi surface are 
found not to change with the different potentials used, we feel that these 
qualitative features are reliable. It should be pointed out that this Fermi 
surface bears no resemblance at all to the Fermi surface of the free-electron 
model. Note that we have drawn the Fermi surface in the so-called double-zone 
scheme. It should be recognized that with the inclusion of relativistic 
effects, an energy gap, manifesting itself as a discontinuity in the Fermi surface, 
will occur in the ALH plane of the figure. 

Fig. 7. The complete Fermi surface for holes in Gd metal in the double-zone 
representation. 

V. Comparison with Experiment 

One might ask, how well do the calculated results agree with experiment? 
As stated earlier, much of the previous theoretical work on rare-earths 
depended critically on the assumed free-electron nature of the conduction 
bands. There is now mounting evidence that the free-electron model is completely 
incorrect and that our 5d-6s bands are indeed the correct representation 
of the conduction bands, as we shall show. 

A. Magnetization and specific-heat measurements 

As regards experiment, it has been difficult, for example, to explain by 
means of the free-electron model the large saturation magnetization (Nigh 
et al., 1963) of Gd (cf. Table 1) and especially the large electronic specific heat 
(Berman et al., 1958; Jennings et al., 1960; Lounasmaa, 1963, 1964) of the 
rare-earth metals which indicates a density of states at the Fermi surface some 
eight times that given by the free-electron model. 

Let us first consider the magnetic properties of gadolinium (Dimmock and 
Freeman, 1964). The measured saturation magnetization of 7.55 $\mu_B$ per 
Gd atom in the metal is 0.55 $\mu_B$ more than expected for an $^8S$ ion. It is common 
to assume that this additional moment arises from a polarization of the 
conduction electrons. Using our computed density of states and a simple 
model in which the conduction electrons are polarized by exchange with the 
localized 4f electrons, one may estimate the exchange integral $J$ required to 
produce this additional moment. For gadolinium the induced moment is 
given by $\mu = \tfrac{7}{2} J N(E_F)\,\mu_B$. For $N(E_F) = 1.8$ eV$^{-1}$ we find $J = 0.08$ eV. 
This $J$ is about five times smaller than that computed between atomic 4f 
and 5d electrons but agrees with values of $J$ (0.05-0.10 eV) calculated between 
a localized 4f electron and a plane wave. (Orthogonalized plane wave calculations, 
on the other hand, give $J$ = 0.04-0.07 eV.) From these estimates, we 
conclude that the “extra” magnetization in Gd can arise very easily from a 
reasonable exchange between the magnetic 4f electrons and the s-d conduction 
electrons at the Fermi energy largely because of the size of our computed 
$N(E_F)$. Further, with this picture of s-d conduction electrons occupying a 
band having a high density of states, one sees a strong qualitative resemblance 
to the transition metals and the role of d electrons in understanding the origin 
of magnetism in these materials. The difference is that in the rare-earth metals 
the bulk of the magnetization is carried by the 4f electrons which, however, 
lie well inside the atom and play no further direct role in interatomic exchange. 
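
The estimate of the exchange integral above is a one-line inversion of the induced-moment relation, and can be sketched as follows. The function name is hypothetical; the inputs (0.55 $\mu_B$, spin 7/2, and 1.8 eV$^{-1}$) are the values quoted in the text.

```python
def exchange_integral(induced_moment, spin, dos_at_fermi):
    """Solve mu = S * J * N(E_F) for J.

    induced_moment is in Bohr magnetons, dos_at_fermi in eV^-1; J is returned in eV.
    """
    return induced_moment / (spin * dos_at_fermi)

j_gd = exchange_integral(0.55, 3.5, 1.8)
print(round(j_gd, 3), "eV")
```

This gives a value close to the 0.08 eV quoted above. Because the result scales inversely with the density of states, a free-electron value of 0.6 eV$^{-1}$ in place of 1.8 eV$^{-1}$ would require an exchange integral three times larger to produce the same moment.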

From our calculated $N(E_F)$ we obtain an electronic specific heat contribution 
of $\gamma = 4.2$ mJ/mole deg$^2$, which may be compared with an average 
measured value of about 10 mJ/mole deg$^2$ for the 4f rare-earth metals with 
triply ionized cores, and with a free-electron value of 1.3 mJ/mole deg$^2$. Thus 
while our calculated $N(E_F)$ is some three times larger than the free-electron 
value, the calculated $\gamma$ is smaller than experiment by about a factor of two. 
Crude estimates for Gd indicate that this difference could arise from 
electron-phonon contributions (Krebs, 1963; Prange and Kadanoff, 1964) to an apparent 
$N(E_F)$ deduced from measured $\gamma$ values. 
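
The conversion from $N(E_F)$ to $\gamma$ used above can be reproduced with the standard independent-electron relation $\gamma = (\pi^2/3) k_B^2 N(E_F)$, with $N(E_F)$ counted for both spins. The sketch below is illustrative only: the function name is hypothetical, and the physical constants are standard values that do not appear in the text.

```python
import math

K_B_EV = 8.617333e-5         # Boltzmann constant in eV/K
EV_TO_J_PER_MOL = 96485.33   # 1 eV per atom expressed in J/mol

def sommerfeld_gamma(dos_per_ev_atom):
    """Electronic specific-heat coefficient in mJ/(mole deg^2) from N(E_F) in eV^-1 per atom."""
    gamma_ev = (math.pi ** 2 / 3.0) * K_B_EV ** 2 * dos_per_ev_atom  # eV/K^2 per atom
    return gamma_ev * EV_TO_J_PER_MOL * 1.0e3

gamma_band = sommerfeld_gamma(1.8)
print(round(gamma_band, 1), "mJ/mole deg^2")
print(round(10.0 / gamma_band, 1), "= ratio of the average measured value to the band value")
```

With $N(E_F) = 1.8$ eV$^{-1}$ this reproduces the 4.2 mJ/mole deg$^2$ quoted above, and the measured average of about 10 mJ/mole deg$^2$ is larger by a little more than a factor of two.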

B. Optical properties 

Predictions can be made from the calculated energy bands concerning 
optical properties. Anomalies in the optical absorption and reflectivity may 
occur due to interband transitions between occupied and unoccupied energy 
levels for frequencies at which the bands yield a high joint density of states. 
These have been discussed elsewhere (Dimmock et al., 1965). In addition to 
these transitions, interband transitions can also occur between the atomic 4f 
levels and the electronic energy states above the Fermi energy. Blodgett et al. 
(1964) have recently measured the photoemission from Gd and have found a 
high density of states at the Fermi energy and a bandwidth in good agreement 
with our theoretical predictions. They have also observed emission from 4f 
levels located about 5.8 eV below the Fermi energy. Since our band calculations 
yield a high density of states at about 1.2 eV above the Fermi energy (see 
Fig. 5) we would expect the transitions between the 4f levels and the conduction 
bands to yield an anomaly in the uv at about 7.0 eV. 

There are also optical anomalies which are expected from the fact that the 
conduction bands in the heavy rare-earth metals will be split at low temperatures 
through an exchange interaction with the 4f electrons (Miwa, 1963). 
As shown in Fig. 1, the heavy rare-earth metals exhibit, at low temperatures, 
some form of magnetic order. This results in an exchange splitting of the s-d 
conduction bands which may take the form either of energy gaps which occur 
at superzone boundaries created by an antiferromagnetic superlattice as 
exists in Tm, or a simple separation into the so-called spin-up and spin-down 
bands of a ferromagnet as would occur in Gd. Both of these types of exchange 
splittings are expected to result in optical absorption and reflectivity anomalies 
due to transitions between the exchange split bands. However, the shape and 
possibly the position of the absorption band might be expected to be somewhat 
dependent on the magnetic structure. 
The infrared reflectivity of a thin film of antiferromagnetic Ho has been 
measured by Schuler (1964). His data show a temperature dependent anomaly 
at about 0.35 eV which has been interpreted as due to interband transitions 
across the energy gaps created by the antiferromagnetic superlattice. Cooper 
and Redington (1965) have observed the infrared absorption spectra of a 
thin film of Dy both with and without an external magnetic field. They observed 
an anomaly at about 0.44 eV quite similar to that seen by Schuler in Ho. The 
spectra observed in antiferromagnetic and ferromagnetic Dy were essentially 
identical. Recently, Schuler has observed a similar anomaly in the reflectivity 
of thin films of ferromagnetic Gd. The frequencies of the anomalies in both 
Dy and Ho are strongly temperature dependent, becoming lower at higher 
temperatures. In addition, the anomalies vanish altogether above the Néel 
points. This indicates that they are probably of magnetic origin, due to the 
exchange splitting of the s-d conduction bands. The exchange splitting is 
given for heavy rare-earth metals at T = 0 by $\Delta E = 2JS$, where $J$ is an effective 
s-f exchange energy and $S$ is the spin of the rare-earth ion. We have 
estimated $\Delta E$, as above, for Gd from our calculated density of states at the 
Fermi surface and the observed saturation magnetization of 7.55 $\mu_B$/atom. 
We found that for Gd $\Delta E = 0.61$ eV. Values of $\Delta E$ for the heavy rare-earth 
metals are given in Table 2 from the value for Gd and the relation $\Delta E = 2JS$, 
assuming $J$ is the same for all metals in the series. Although the values of $\Delta E$ given in Table 2 
are to be considered as rough estimates only, the agreement with the position 
of the absorption anomaly in Dy at about 0.44 eV and with the position of the 
reflectivity anomaly in Ho at about 0.35 eV is quite encouraging. 


TABLE 2 

Estimates of the Values of $\Delta E$ for the 
Heavy Rare-Earth Metals 

Metal    S      $\Delta E$ (eV) 

Gd       7/2    0.61 
Tb       3      0.52 
Dy       5/2    0.44 
Ho       2      0.35 
Er       3/2    0.26 
Tm       1      0.17 
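
The entries of Table 2 follow from scaling the Gd estimate with the ionic spin. A minimal sketch, assuming the same exchange energy for every metal as stated in the text and taking it from the Gd induced moment (0.55 $\mu_B$) and density of states (1.8 eV$^{-1}$); the names used are hypothetical:

```python
# Ionic spins of the heavy rare-earth metals, as listed in Table 2.
SPINS = {"Gd": 3.5, "Tb": 3.0, "Dy": 2.5, "Ho": 2.0, "Er": 1.5, "Tm": 1.0}

def exchange_splitting(j_ev, spin):
    """Exchange splitting of the conduction bands at T = 0, in eV: twice the exchange energy times the spin."""
    return 2.0 * j_ev * spin

# Exchange energy fixed by the Gd data (assumed common to the whole series).
j_ev = 0.55 / (3.5 * 1.8)

table = {metal: round(exchange_splitting(j_ev, s), 2) for metal, s in SPINS.items()}
for metal, value in table.items():
    print(f"{metal}  {value:.2f} eV")
```

The values obtained for Dy and Ho, 0.44 eV and 0.35 eV, are the ones compared in the text with the observed absorption and reflectivity anomalies.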


C. Electrical resistivity anomalies 

Finally, we discuss briefly the resistivity anomalies observed at the magnetic 
ordering temperatures of the heavy rare-earths (Colvin et al., 1960; Hall et 
al., 1958; Hegland et al., 1963; Elliott and Wedgwood, 1963; Mackintosh, 
1962; Miwa, 1963). As mentioned in Section II, these anomalies have been 
successfully interpreted in terms of the magnetic ordering of these materials. As 
seen in Fig. 1, Tm shows a linear spin wave-type structure between 50° and 
40°K in which the magnitude of the z-component of the moments varies 
sinusoidally with distance along the c-axis. The period of the wave vector is 
constant at seven (hexagonal) layers over the observed temperature range and 
differs from that of many of the other rare-earth structures which have periods 
incommensurate with the lattice and which vary with temperature. As 
described briefly above, this periodic magnetic structure introduces planes of 
energy discontinuity (superzone gaps) into the nonmagnetic Brillouin zone 
structure (Elliott and Wedgwood, 1963; Mackintosh, 1962; Miwa, 1963). 
These energy gaps may affect drastically the electrical conduction of the 
metal as the temperature is lowered through the Néel temperature provided 
one of these superzone boundaries destroys a large part of the Fermi surface. 
The observation of large anisotropy in the resistivity (maximum along the 
c-axis) and the reduction of the anomaly in Tb when the sample is cooled in a 
magnetic field (which suppresses the long-range order and reduces the band 
gaps) provided confirmation for the validity of the model. The existence of 
superzone planes which cut large sections of the Fermi surface in the free- 
electron model led easily to an explanation of the resistivity anomaly along 
the c-axis and to the variation in resistivity with the variation in magnetic 
period. This agreement has since been taken as evidence in support of the 
validity of the free electron model. However, our determination that the 
conduction bands in Gd resemble those of the transition metals and differ 
markedly from free electron bands showed that this model was completely 
invalid for the rare-earth metals. It also raised the question at the same time 
as to whether agreement between theory and experiment could be restored 
using the APW band structure and the resulting Fermi surface. 

The usual argument, based on first-order perturbation theory, asserts that 
energy gaps and superzone boundaries occur at values of k for which 

$$E(\mathbf{k}) - E(\mathbf{k} \pm \mathbf{q}) = 0 \qquad (1)$$ 

when the system is subjected to a perturbation of wave vector $\mathbf{q}$. Experimentally, 
$\mathbf{q}$ lies parallel to the crystal c-axis in the rare-earth metals. Since by 
symmetry $E(\mathbf{k}) = E(-\mathbf{k})$, this results in planar superzone boundaries at 
$k_z = \pm\tfrac{1}{2}q$, or more completely at $k_z = \tfrac{1}{2}K_z \pm \tfrac{1}{2}q$, where $K_z$ is the z component of a 
reciprocal lattice vector. It should be noted that although these are the only 
superzone boundaries required by symmetry, to this order, in general other 
boundaries may occur at values of k which satisfy Eq. (1) and that these values 
of k will depend specifically on the energy-band structure or, alternatively, 
on the particular Fermi surface. Specifically, gaps occur at those values of k 
where both k and k + q lie on the Fermi surface, that is, where q spans a 
section of the Fermi surface. 
All of this discussion is limited to first-order perturbation theory, which 
should be applicable in the vicinity of the ordering temperature where the 
magnetic moment is small. However, at lower temperatures, if the periodic 
perturbation is large, compared to the separations between energy bands, a 
higher order treatment must be considered. Accepting the arguments given 
above then, from Table 2 we see that in Tm at T = 0 the energy gaps introduced 
at the superzone boundaries should be about 0.17 eV wide. This energy is 
comparable to the separation between the flat d-bands in the vicinity of the 
Fermi surface which we have obtained for the rare-earth metals. So-called 
higher order terms in perturbation theory will introduce energy gaps of 
magnitude comparable to those at $k_z = \pm\tfrac{1}{2}q$. Therefore, because of the fact 
that these bands are relatively flat and closely spaced, we must replace Eq. (1) 
by 

$$E(\mathbf{k}) - E(\mathbf{k} \pm n\mathbf{q}) = 0. \qquad (2)$$ 


This then leads to superzone boundaries at $k_z = \pm\tfrac{1}{2}nq$. 

Our calculated energy bands and Fermi surface for Tm closely resemble 
those for Gd shown in Figs. 6 and 7. Figure 8 shows several vertical cross 
sections of the Fermi surface calculated for Tm. The horizontal lines denote 
superzone boundaries at $k_z = \pm n(2\pi/7c)$ introduced by the periodic magnetic 
ordering. The expected distortion of the Fermi surface at T = 0 is given 
approximately by the heavy solid curves. The largest portions of the Fermi 
surface are destroyed by the superzone boundary corresponding to n = 3. 
Notice that this effect is a direct consequence of the presence of several relatively 
flat d-bands in the vicinity of the Fermi energy and would not occur to 
any extent in the free-electron model. Notice further that the vector $q = 4\pi/7c$ 
actually spans a section of the calculated Fermi surface as can be seen in Fig. 
8(b). This is essential to the resistivity anomaly at the transition temperature 
as discussed above. 

Fig. 8. Some vertical cross sections of the Tm Fermi surface containing the c-axis. 
The influence of magnetic ordering is demonstrated by comparing the low temperature 
cross sections, shown as heavy solid curves, with the high temperature cross sections, shown 
as light solid curves. The horizontal lines denote superzone boundaries at $k_z = \pm n(2\pi/7c)$ 
introduced by the magnetic ordering. 
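
The positions of the superzone boundaries for Tm follow directly from the seven-layer magnetic period. The sketch below assumes the usual hcp stacking of two hexagonal layers per lattice constant c (layer spacing c/2) and works in units of 1/c; the function names are hypothetical.

```python
import math

def modulation_q(period_layers, c=1.0):
    """Wave vector of a magnetic structure that repeats every period_layers hexagonal layers."""
    return 2.0 * math.pi / (period_layers * c / 2.0)

def superzone_planes(q, k_max):
    """Positive k_z values n*q/2 (n = 1, 2, ...) that do not exceed k_max."""
    planes = []
    n = 1
    while n * q / 2.0 <= k_max:
        planes.append((n, n * q / 2.0))
        n += 1
    return planes

q = modulation_q(7)                    # equals 4*pi/7 in units of 1/c
planes = superzone_planes(q, math.pi)  # boundaries out to k_z = pi/c

print(q / math.pi, "* pi/c")
for n, kz in planes:
    print(n, kz / math.pi, "* pi/c")
```

With a period of seven layers the modulation vector is $4\pi/7c$ and the planes fall at multiples of $2\pi/7c$, as in Fig. 8. The n = 3 plane, singled out in the text as destroying the largest portions of the Fermi surface, lies at $6\pi/7c$, just inside $k_z = \pi/c$.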

In Fig. 8, we have drawn the perturbed Fermi surface in such a way as to 
emphasize its relationship to the unperturbed surface (the light solid curves). 
It should be recognized that the Bloch states which originally had unique 
k values within the double Brillouin zone for the hcp structure are now mixtures 
of k values differing by multiples of q. Therefore, the assignment of a 
particular one-electron state to a particular point in k space is ambiguous to 
this degree. However, the essential feature of the result is that, in the magnetic 
state, the Fermi surface normal to the z-axis is largely destroyed while 
segments parallel to the axis remain, though somewhat perturbed. This 
implies a resistance anomaly parallel but not perpendicular to the crystal 
c-axis in agreement with experiment. It should be pointed out, however, that 
the calculations we have performed are all nonrelativistic and that relativistic 
corrections, including spin-orbit effects, will certainly modify, to some extent, 
the computed Fermi surface. Nevertheless, we feel that the general features 
of the Fermi surface and of the effects of magnetic ordering, should remain 
even when the relativistic effects are taken into account. 

In summary, we can make the following statements: (1) The observed q 
vector spans a section of the calculated Fermi surface of Tm such that a 
section of this Fermi surface perpendicular to the z-axis is destroyed at the 
onset of magnetic ordering. This results in the observed resistivity anomaly 
at $T_N$. (2) Due to the fact that the energy bands are relatively flat in the vicinity 
of the Fermi energy, gaps will occur at superzone boundaries for $k_z = \pm\tfrac{1}{2}nq$ 
as the temperature is lowered below $T_N$, destroying large sections of the Fermi 
surface normal to the z-axis. It appears that the anomalies in the temperature 
dependence of the resistivity of the heavy rare-earth metals can be understood 
in terms of the calculated energy bands of these materials, and that qualitative 
agreement, at least, exists between theory and experiment. A quantitative 
comparison appears difficult at this time because of the expectation of various 
relativistic corrections, of other uncertainties in the energy-band structure, and 
of the generally complex nature of the calculated energy bands. 


VI. Conclusion 

In conclusion, we emphasize that the computed conduction bands for the 
rare-earth metals are not at all represented by the free-electron model but 
instead are of s-d character and strongly resemble the energy bands of transition 
metals. There occurs a high density of states at the Fermi energy as 
implied by specific heat and magnetization data and shown by recent photoemission 
measurements. The additional magnetic moment measured in the 
rare-earth metals over that expected on the basis of the 4f shell alone is due 
to an exchange polarization of the s-d conduction bands. We have also shown 
that, using our calculated energy bands and Fermi surfaces, at least qualitative 
agreement may be obtained between theory and experiment for some optical 
and electrical properties despite the limitations inherent in the calculations. 

Finally, we note that the Fermi surface obtained has many unusual features, 
such that it would be very interesting to study the de Haas-van Alphen, 
Shubnikov-de Haas and cyclotron resonance phenomena in these materials. 
Unfortunately, the purity of the rare-earth metals is not at present sufficiently 
high to allow one to make these measurements. However, it may be possible 
to study the Fermi surface topology through magnetoresistance measurements 
and it is to be hoped that such experiments will be undertaken soon. 


ACKNOWLEDGMENTS 

We are indebted to Mrs. A. Furdyna who participated fully in many parts of this work. 
We are grateful to J. H. Wood for making the APW programs available to us and for 
many helpful discussions, and to A. Furdyna and R. Sheshinski for their help with many 
phases of the computations. 


REFERENCES 

Berman, A., Zemansky, M. W., and Boorse, H. A. (1958). Phys. Rev. 109, 70. 

Blodgett, A. J., Spicer, W. E., and Yu, A. Y-C. (1966). In “Optical Properties and Electronic
Structure of Metals and Alloys” (F. Abeles, ed.), p. 246. North-Holland Publ.,
Amsterdam. 

Colvin, R. V., Legvold, S., and Spedding, F. H. (1960). Phys. Rev. 120, 741. 

Cooper, B. R., and Redington, R. W. (1965). Phys. Rev. Letters 14, 1066. 

Dimmock, J. O., and Freeman, A. J. (1964). Phys. Rev. Letters 13, 750. 

Dimmock, J. O., Freeman, A. J., and Watson, R. E. (1966). In “Optical Properties and 
Electronic Structure of Metals and Alloys” (F. Abeles, ed.), p. 237. North-Holland 
Publ., Amsterdam. 

Elliott, R. J., and Wedgwood, F. A. (1963). Proc. Phys. Soc. 81, 846. 

Freeman, A. J., and Watson, R. E. (1962). Phys. Rev. 127, 2058. 

Hall, P. M., Legvold, S., and Spedding, F. H. (1958). Phys. Rev. 109, 971. 

Hegland, D. E., Legvold, S., and Spedding, F. H. (1963). Phys. Rev. 131, 158.

Herman, F., and Skillman, S. (1963). “Atomic Structure Calculations.” Prentice-Hall, 
Englewood Cliffs, New Jersey. 

Jennings, L. D., Miller, R. E., and Spedding, F. H. (1960). J. Chem. Phys. 33, 1849. 
Koehler, W. C. (1965). J. Appl. Phys. 36, 1078S. 

Krebs, K. (1963). Phys. Letters 6, 31. 

Loucks, T. L. (1965). Phys. Rev. 134, A1181, A1333. 



Lounasmaa, O. V. (1962). Phys. Rev. 126.

Lounasmaa, O. V. (1963). Phys. Rev. 129, 2460.

Lounasmaa, O. V. (1964). Phys. Rev. 133, A219.

Mackintosh, A. R. (1962). Phys. Rev. Letters 9, 90. 

Miwa, H. (1963). Progr. Theor. Phys. Japan 29, 477. 

Nigh, H., Legvold, S., and Spedding, F. H. (1963). Phys. Rev. 132, 1092. 

Prange, R. E., and Kadanoff, L. P. (1964). Phys. Rev. 134, A566.

Saffren, M. M. (1959). Ph.D. Thesis, M.I.T., Physics Department. 

Schuler, C. C. (1964). Phys. Letters 12, 84. 

Schuler, C. C. (1966). To be published. 

Slater, J. C. (1937). Phys. Rev. 51, 846. 

Slater, J. C. (1963). Phys. Rev. 92, 603. 

Slater, J. C. (1965). “Quantum Theory of Molecules and Solids,” Vol. II. McGraw-Hill, 
New York. 

Slater, J. C., and Saffren, M. M. (1953). Phys. Rev. 92, 603, 1126.

Wood, J. H. (1960). Phys. Rev. 117, 714.


New Studies of the Band Structure of Silicon, 
Germanium, and Grey Tin* 


F. HERMAN, R. L. KORTUM, C. D. KUGLIN, and R. A. SHORT 

LOCKHEED PALO ALTO RESEARCH LABORATORY, PALO ALTO, CALIFORNIA 


I. Introduction

II. Theoretical Approach

A. Self-Consistent Energy Band Calculations

B. Empirical Perturbation Scheme

C. The Koopmans Correction

D. Adjustment Strategy

III. Grey Tin

A. Energy Band Model

B. Spin-Orbit Splitting

IV. Germanium

A. Energy Band Model

B. Experimental Evidence

C. Further Remarks about ΔE(ΔV)

D. Further Remarks about E(NRSC)

E. Deformation Potential Studies

F. Extreme Pressure Studies

V. Silicon

A. Preliminary Remarks

B. Energy Band Model

C. Piezoreflectance Studies

D. Deformation Potential Studies

E. Photoemission Studies

F. Electroreflectance Studies

VI. Germanium-Silicon Alloys

VII. Concluding Remarks

References

Note Added in Proof


I. Introduction 

Although a vast amount of experimental and theoretical information is now
available concerning the valence and conduction band edges of silicon and
germanium, and to a lesser extent grey tin, our detailed knowledge of
the energy band structure of these crystals away from the band edges is still
surprisingly incomplete. The object of the present study is to elucidate, in
quantitative terms, the nature of these relatively unexplored regions of the band
structure, and thereby to provide a reliable theoretical guide for experimental
investigations of optical, photoemissive, and pressure-sensitive properties that
depend on the band structure away from the band edges.

It is possible to obtain a reasonably good qualitative picture of the band 
structure over a 10-20 eV range by carrying out first-principles energy band 
calculations, but the accuracy of such calculations is often not high enough 
for detailed analyses of experimental spectra. Another approach is represented 
by the empirical pseudopotential method (Brust, 1964; Cohen and Bergstresser, 
1966), and by the empirical full zone k · p method (Cardona and Pollak, 1966).
The idea here is to set up a parametric model of the energy band structure, and 
then to adjust the parameters so that the calculated band structure agrees with 
the few known features of the experimental band structure. The task of fitting 
the experimental band structure over a 10-20 eV range with an empirical model 
is a very difficult one, and some sacrifice in local accuracy in the interest of a 
good overall fit is probably unavoidable. In spite of their convenience, empirical 
band models are not nearly as accurate as some of their advocates (cf. Phillips, 
1966) would have us believe. The marked discrepancies between experimental 
and computed values for the imaginary part of the complex dielectric response 
function for silicon and germanium (cf. Brust, 1964, especially Fig. 16) should 
serve as a warning to those who would accept empirical pseudopotential band 
models uncritically. 

In view of the recent development of high-resolution experimental techniques 
for studying band structure, such as modulated electroreflectivity (Seraphin, 
1964, 1965) and modulated piezoreflectivity (Engeler et al., 1965; Gobeli and 
Kane, 1965), the need for high-accuracy band calculations is now more acute 
than ever. We believe that recent attempts to interpret electroreflectivity and 
piezoreflectivity spectra for silicon and germanium in terms of pseudopotential 
band models have been only partially successful because these models are not 
sufficiently accurate for such an exacting task. 

In this article, we describe a method for determining the band structure of 
crystals which is designed to have a higher accuracy than purely first-principles 
or purely empirical (pseudopotential or k · p) methods. The first step in our
approach is a nonrelativistic self-consistent energy band calculation which is 
carried out with far greater care than is customary. In the second step, we 
depart from a purely first-principles approach, and introduce a small empirical 
adjustment which serves to bring key features of the theoretical energy level 
scheme into exact agreement with the most reliably established features of the 
experimental level scheme. Our approach to superior accuracy, then, is based
on the addition of a small, carefully chosen empirical adjustment to an otherwise
first-principles band calculation. The empirical adjustment proves to be
quite small: it is usually sufficient to change a few of the leading Fourier
coefficients of the self-consistent crystal potential by a few percent each.

Although the study of crystals whose band structures are poorly understood
was foremost in our minds when we set out to develop this empirically
perturbed self-consistent field method, it seemed worthwhile to test our approach 
on the well-studied diamond-type crystals before attempting the study of other, 
less familiar, crystals. In addition to testing out our method on these crystals, 
we have succeeded in obtaining a considerable amount of new and hopefully 
accurate information about (a) key features of the energy band structure of 
silicon, germanium, and grey tin away from the valence and conduction band 
edges; (b) systematics of the energy band structure of the germanium-silicon 
alloy system away from the band edges; (c) net deformation potential differences
for many important interband transitions; and (d) systematics of the
band structure of germanium as a function of lattice constant over an extreme 
lattice constant range. 

We have also examined a number of previous theoretical ideas (cf. Phillips, 
1966) concerning energy level assignments and the identification of interband 
transitions principally responsible for characteristic features of optical spectra. 
We have examined these ideas in the light of the experimental evidence itself, 
in the light of our new energy band models, and in the light of our new deformation
potential calculations. For example, our calculated deformation potentials
for specific interband transitions were compared with the measured deformation
potentials for supposedly related optical reflectivity peaks.

Our new results confirm many of the earlier energy level assignments, but
show, for example, that the $X_1-X_4$* transition is not really representative of
the main optical reflectivity peak, a conclusion that has also been reached by
Kane (1966) on the basis of other considerations. Our results are at variance
with the common identification of the 3.4 eV reflectivity peak in silicon with
$\Gamma_{15}-\Gamma_{25'}$ or closely related $\Delta_1-\Delta_5$ transitions (Phillips, 1962, 1964, 1966;
Cardona and Pollak, 1966; Brust, 1964; Cohen and Bergstresser, 1966).

In fact, our results suggest that the $\Gamma_{15}-\Gamma_{25'}$ transition has not been properly
identified previously, not only in silicon, but also in germanium and grey tin.
In all three of these crystals, we find that this important transition lies at least
0.5 eV lower in energy than is currently believed. This conclusion has an
important bearing on current interpretations of reflectivity, electroreflectivity,
piezoreflectivity, and photoemission spectra, since a change in $\Gamma_{15}-\Gamma_{25'}$ by
even 0.5 eV would have a profound effect on the structure of three of the
four lowest conduction bands in the central region of the reduced zone, and on
the detailed nature of interband transitions in the range between 2 and 4 eV.

* Contrary to common usage, we will denote the transition A → B as B-A, since B-A is
closer in form to the expression for the transition energy, E(B) − E(A).



II. Theoretical Approach 

A. Self-consistent energy band calculations 

Our nonrelativistic self-consistent (NRSC) band calculations are based on 
the orthogonalized plane wave (OPW) method (for a review, see Herman, 
1958), and on Slater’s simplified version of the Hartree-Fock equations (Slater, 
1951). The present calculations are based primarily on the free-electron exchange
approximation proposed by Slater in his 1951 paper, though other
exchange approximations are used in selected portions of our work. The
free-electron exchange approximation has the dual virtue of being physically
realistic and computationally tractable.

The mathematical approximations and numerical procedures used in the
present study are identical to those already described in our 1964 Paris
Semiconductor Conference paper (Herman, 1964), except that a greatly improved
set of computer codes is now used in place of the earlier set. Most of the
differences between theory and experiment noted in our preliminary account
(Herman, 1964) can be attributed to the use of orthogonality coefficients that
were not computed with sufficiently high accuracy. Because of cancellation
effects, a slight inaccuracy (even 1%) in one term of an OPW matrix element
can have serious repercussions. In the present work, all terms appearing in
the OPW matrix elements are accurate to five significant figures.

We begin our NRSC band calculations by computing preliminary energies
and wave functions at a set of 32 sample points in the reduced zone on the
basis of a trial crystal potential. The crystal charge density is then determined
from this set of sample wave functions. We then iterate, calculating a new
crystal potential each time from the previously determined crystal charge
density. The iteration is continued until the crystal charge density is
self-consistent to better than one part in $10^3$, often to better than one part in $10^4$.
In our standard NRSC band calculation, the OPW expansion includes about
120 terms at each of the 32 sample points, which are as follows: 1 (0 0 0) +
6 (½ 0 0) + 12 (½ ½ 0) + 4 (½ ½ ½) + 3 (1 0 0) + 6 (1 ½ 0), where a
common factor of (2π/a) has been omitted. The overall precision of our standard
NRSC energy level scheme is estimated to be about ±0.05 eV.

We have checked the convergence of our NRSC calculations by doubling
the number of sample points [this is accomplished by adding 8 (¼ ¼ ¼) + 24
(¾ ¼ ¼) to the previous list], and also by increasing the number of OPWs per
point from about 120 to about 180. Each of these refinements affected the
NRSC energy level scheme only slightly: energy level separations were usually
changed by less than ±0.0025 eV.
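
The weights quoted for the sample points can be checked by a short enumeration. The sketch below is only an illustration, not the authors' procedure: it assumes the 32 standard points are the points of a mesh of spacing ½(2π/a) counted once per reciprocal-lattice translation class, and that the 32 added points form the same mesh displaced by (¼ ¼ ¼). All function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Primitive reciprocal-lattice vectors of the fcc lattice, in units of 2*pi/a.
B = [(-1, 1, 1), (1, -1, 1), (1, 1, -1)]


def combine(coeffs):
    """Return sum_i coeffs[i] * B[i] as an integer 3-tuple."""
    return tuple(sum(c * b[j] for c, b in zip(coeffs, B)) for j in range(3))


def star_label(q):
    """Fold q, given in units of (1/4)(2*pi/a), into the first zone.

    The label is the folded point's absolute coordinates sorted in
    descending order, which identifies its star of equivalent points.
    """
    best = None
    for m in product(range(-3, 4), repeat=3):
        g = combine(m)  # reciprocal-lattice vector in units of 2*pi/a
        d = [q[j] - 4 * g[j] for j in range(3)]
        key = (sum(x * x for x in d),
               tuple(sorted((abs(x) for x in d), reverse=True)))
        if best is None or key < best:
            best = key
    return best[1]


def sample_point_stars():
    """Count inequivalent points per star for the standard and added sets."""
    standard, added = Counter(), Counter()
    for n in product(range(4), repeat=3):
        q = combine(n)  # the point sum_i (n_i/4) b_i, in quarter units
        # Even quarter-unit coordinates: mesh of spacing 1/2 (standard set).
        # Odd quarter-unit coordinates: the displaced mesh (added set).
        target = standard if q[0] % 2 == 0 else added
        target[star_label(q)] += 1
    return standard, added


standard, added = sample_point_stars()
for label, weight in sorted(standard.items()) + sorted(added.items()):
    print(weight, tuple(x / 4 for x in label))
```

Under these assumptions, points on the zone boundary are counted once per translation class, which is how the weights 4, 3, and 6 arise for the (½ ½ ½), (1 0 0), and (1 ½ 0) stars.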

B. Empirical perturbation scheme 

Instead of attempting to improve the physical rigor of the NRSC energy
band calculation (by using a more refined exchange approximation, or possibly
even by including relativistic, spin-orbit coupling, and electron correlation
effects), we will take a shortcut, and compensate for our shortcomings in
physical rigor by means of a crystal potential perturbation ΔV which is to be
determined empirically. This perturbation, which may be represented formally
by nonlocal as well as by local operators, is designed to bring key features of
the theoretical energy level spectrum into agreement with the most reliably
established features of the experimental spectrum. In effect, ΔV will be used to
generate a set of energy level shifts ΔE(ΔV) which transform the NRSC
energy level scheme, E(NRSC), into a modified (or perturbed) scheme,
E(PERT), according to the defining relation:

$$E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E(\Delta V).$$

For convenience, we choose to classify all energy levels according to the
irreducible representations of the single group. Accordingly, the spin-orbit
coupling corrections ΔE(SO) will not be included in ΔE(ΔV). Instead, these
corrections will be subtracted from the (actual) experimental energy level
spectrum, E(EXPT), thereby generating the (single group) experimental
spectrum, E*(EXPT):

$$E^*(\mathrm{EXPT}) = E(\mathrm{EXPT}) - \Delta E(\mathrm{SO}).$$

At this stage, our objective is to adjust E(PERT) to E*(EXPT) by a suitable
choice of ΔV. Since we are not yet in a position to calculate ΔE(SO) from first
principles, we are obliged to use experimental values for ΔE(SO) in the present
work.
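
As a rough illustration of how a spin-orbit correction might be removed from a measured gap, the sketch below assumes the simplest textbook case: a p-like valence level whose spin-orbit splitting Δ raises a fourfold level by Δ/3 and lowers a twofold level by 2Δ/3 about the unsplit center of gravity, an s-like upper level that is unshifted, and a gap measured from the fourfold level. The function names and the numerical values are hypothetical and are not taken from the text.

```python
def split_p_level(e_center, so_splitting):
    """Return (quartet, doublet) energies of a spin-orbit split p-like level.

    The weighted center of gravity, (4*quartet + 2*doublet)/6, stays at
    e_center, and the two levels are separated by so_splitting.
    """
    quartet = e_center + so_splitting / 3.0
    doublet = e_center - 2.0 * so_splitting / 3.0
    return quartet, doublet


def single_group_gap(measured_gap, so_splitting):
    """Apply E*(EXPT) = E(EXPT) - dE(SO) to a gap measured from the quartet.

    Assumption: the upper level is unshifted, so the spin-orbit correction
    to the transition energy is dE(SO) = -so_splitting / 3.
    """
    delta_e_so = -so_splitting / 3.0
    return measured_gap - delta_e_so


# Illustrative numbers only (eV).
print(single_group_gap(measured_gap=0.80, so_splitting=0.30))
```

In a semimetal such as grey tin the level ordering differs, so this simple rule would need to be adapted before use there.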

The success of our plan depends on our ability to devise a suitable ΔV. It is
instructive to make a formal decomposition of ΔV and then study the effect
each individual part has on the band structure. If ΔV is regarded as a local
function, a Fourier decomposition is appropriate:

$$\Delta V(\mathbf{r}) = \sum_{\mathbf{h}} \Delta v(\mathbf{h}) \exp(i\mathbf{h}\cdot\mathbf{r}).$$

By first-order perturbation theory, we can determine the energy level shifts
ΔE(h) produced by the leading Fourier coefficients, Δv(h). Each symmetrically
equivalent set of $\Delta v(\mathbf{h})$ (denoted for short by Δv(h), as in Δv(111)) can be
treated as an adjustable parameter. In deciding which Δv(h) to concentrate on,
we can be guided to some extent by the relative magnitude of the estimated
uncertainties in the corresponding self-consistent v(h). For example, an
improvement in our treatment of valence-valence exchange would have the most
profound effect on v(111), so that Δv(111) can be regarded as a prime
adjustment parameter.
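
The first-order estimate of the level shifts can be illustrated on a toy problem. The sketch below is not the OPW calculation described here: it assumes a hypothetical one-dimensional plane-wave model (lattice constant 2π, units with ħ²/2m = 1, made-up Fourier coefficients), changes one Fourier coefficient by 3 percent, and compares the first-order shifts ⟨ψ|ΔV|ψ⟩ with the shifts obtained by rediagonalizing.

```python
import numpy as np


def potential_matrix(coeffs, n_max):
    """Matrix of a local potential in the basis exp(iGx), G = -n_max..n_max.

    coeffs maps a harmonic |G - G'| = h to its (real) Fourier coefficient.
    """
    g = np.arange(-n_max, n_max + 1)
    diff = np.abs(g[:, None] - g[None, :])
    mat = np.zeros(diff.shape)
    for h, v_h in coeffs.items():
        mat[diff == h] += v_h
    return mat


def level_shifts(k, v, dv, n_max=5):
    """Return unperturbed levels, first-order shifts, and exact shifts."""
    g = np.arange(-n_max, n_max + 1)
    h0 = np.diag((k + g) ** 2.0) + potential_matrix(v, n_max)
    dh = potential_matrix(dv, n_max)
    e0, c = np.linalg.eigh(h0)
    # First-order shift of level n: sum_ij c[i, n] * dh[i, j] * c[j, n].
    first_order = np.einsum("in,ij,jn->n", c, dh, c)
    exact = np.linalg.eigvalsh(h0 + dh) - e0
    return e0, first_order, exact


v = {1: -0.2, 2: 0.05}      # hypothetical model potential coefficients
dv = {1: 0.03 * v[1]}       # a 3 percent change in the leading coefficient
e0, first_order, exact = level_shifts(k=0.3, v=v, dv=dv)
print(np.max(np.abs(first_order - exact)))
```

For a change of a few percent in one coefficient, the first-order and exact shifts of this toy model agree closely, which is the regime in which treating each Δv(h) as a linear adjustment parameter is reasonable.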

By modifying the nonlocal repulsive potential arising from the orthogonality
terms in the OPW matrix elements, a nonlocal ΔV can be simulated. This
is most readily accomplished by altering the various core energy levels, either
individually or in selected combinations:
$E_\alpha(\mathrm{NRSC}) \to E_\alpha(\mathrm{NRSC}) + \Delta E_\alpha^{\mathrm{core}}$,
where α = 1s, 2s, 2p, .... If all the core levels are shifted by a common amount,
$\Delta E^{\mathrm{core}}$, the valence and conduction band levels will be shifted in a manner
described by the symbol ΔE(CORE). If only the s core levels (1s, 2s, ...) are
displaced from their NRSC values, the shift pattern ΔE(CORE/s) is obtained,
and similarly for selective shifts of p and d core levels.

The relativistic (mass-velocity and Darwin) correction, ΔE(REL), is a
special case of ΔE(ΔV) which can be evaluated by first-order perturbation
theory (Herman et al., 1963). For germanium and grey tin, it is found that
ΔE(REL) is nearly proportional to ΔE(CORE), suggesting that ΔE(CORE)
is also a prime adjustment parameter. In these crystals, and presumably in
others, ΔE(CORE) can be used to simulate the relativistic correction, as well
as other corrections associated with the ion core region.

A typical two-parameter adjustment scheme is based on the two parameters
Δv(111) and $\Delta E^{\mathrm{core}}$, and has the form

$$E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E(111) + \Delta E(\mathrm{CORE}).$$

In practice, a satisfactory adjustment can be carried out by using a Δv(111) of
the order of 0.03 v(111), and a core shift $\Delta E^{\mathrm{core}}$ of the order of 0.2 Ry. Another
two-parameter scheme is based on Δv(111) and Δv(220) and has the form

$$E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E(111) + \Delta E(220).$$

In this case, a satisfactory adjustment usually involves a Δv(111) of the order of
0.03 v(111), and a Δv(220) of the order of 0.03 v(220); here v(111) and v(220) are
the NRSC Fourier coefficients of the crystal potential.

As will be indicated more fully in our subsequent treatment of germanium,
the adjusted energy level scheme E(PERT) proves to be relatively insensitive
to the exact form of the starting point E(NRSC), provided this starting point
is already in reasonable qualitative agreement with experiment [as all our
E(NRSC) are]. Moreover, E(PERT) proves to be relatively insensitive to the
particular adjustment parameters employed, provided these are chosen in a
reasonable manner. Therefore, E(PERT) is determined primarily by the
qualitative features of E(NRSC), and by the specific features of E*(EXPT) with
which E(PERT) is brought into register. In order to obtain the most reliable
adjusted energy level scheme, the most sensitive features of E(PERT), rather
than the least sensitive, should be adjusted to E*(EXPT). This important
practical consideration will be amply illustrated in our subsequent treatments
of germanium and silicon.

C. The Koopmans correction

It must be emphasized that exact agreement between E(NRSC) and
E*(EXPT) should not be expected, because the NRSC calculations are based
on a simplified model. Apart from the neglect of relativistic and correlation
effects, the approximate treatment of exchange leads to two separate difficulties.
In the first place, electron exchange effects are not treated exactly. In the
second place, and this is not generally appreciated, Koopmans’ theorem is no
longer strictly obeyed when a simplified version of the Hartree-Fock exchange
operator is used (Lindgren, 1965a,b). Consequently, one-electron energy
eigenvalues do not correspond (exactly) to one-electron binding energies
(orbital energies), as they do in the rigorous Hartree-Fock formalism. In our
treatment, the change in the total energy of the system produced by an interband
electronic transition (the true measure of the experimental transition
energy) is only approximately equal to the difference between the initial and
final NRSC one-electron energy eigenvalues.

Theoretically, we can introduce a correction, the Koopmans correction
ΔE(KOOP), which converts one-electron energy eigenvalues into one-electron
binding energies. If we could actually evaluate this correction, or even estimate
it reasonably well, our empirical adjustment scheme could be improved
considerably, for ΔE(ΔV) would no longer have to include ΔE(KOOP) implicitly.
We could replace E(PERT) = E(NRSC) + ΔE(ΔV) by E(PERT) = E(NRSC) +
ΔE(KOOP) + ΔE′(ΔV), where ΔE′(ΔV) is (hopefully) smaller than ΔE(ΔV).

In the case of atoms and molecules, the Koopmans correction is readily 
determined, since this is essentially the expectation value of the difference 
between the exact Hartree-Fock exchange operator and the approximate 
exchange operator. For systems such as atoms and molecules, the exact 
Hartree-Fock exchange operator can be computed, but for a system as complicated
as a crystal, this operator is not easily calculated. Therefore, the task
of determining the Koopmans correction as part of a crystal calculation seems
quite formidable. With further study, however, ways may be found to obtain
good estimates for ΔE(KOOP), but for the present, we are compelled to include
ΔE(KOOP) implicitly in our empirical energy level shift ΔE(ΔV).

D. Adjustment strategy 

Since a principal objective of the present study is the confirmation or
contradiction of previous theoretical ideas concerning the nature of the band
structure away from the valence and conduction band edges, it is obviously
desirable to ignore these ideas (cf. Phillips, 1962, 1964, 1966; Cardona, 1965;
Brust et al., 1962a,b; Brust, 1964; Cohen and Bergstresser, 1966; Cardona
and Pollak, 1966; Saslow et al., 1966) in setting up our own energy band
models. Therefore, in attempting to fit theory to experiment, we will rely most
heavily on well-authenticated experimental information bearing on the direct
and indirect band gaps.

In grey tin and germanium, we are fortunate that the direct and indirect
band gaps are fairly well understood, and that these band gaps happen to be
defined by the two most sensitive transitions in the E(PERT) scheme, namely,
$\Gamma_{2'}-\Gamma_{25'}$ and $L_1-\Gamma_{25'}$ (cf. Herman and Skillman, 1961). For these two
crystals, we will use a variety of two-parameter adjustment schemes, obtaining
values for the two parameters by bringing $\Gamma_{2'}-\Gamma_{25'}$ and $L_1-\Gamma_{25'}$ in the
E(PERT) scheme into exact register with their counterparts in the E*(EXPT)
scheme.
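
If the level shifts are taken to be linear in the two adjustment parameters (the first-order picture used above), bringing two transitions into exact register reduces to solving a 2 × 2 linear system. The sketch below illustrates this under that assumption; the sensitivities and energies are invented for the example and are not values from this work.

```python
import numpy as np


def fit_two_parameters(e_nrsc, sensitivity, e_target):
    """Solve e_nrsc + sensitivity @ p = e_target for the two parameters p.

    sensitivity[i, j] is the assumed first-order change of transition i
    per unit change of adjustment parameter j.
    """
    return np.linalg.solve(sensitivity, e_target - e_nrsc)


def perturbed_levels(e_nrsc_all, sensitivity_all, params):
    """Apply the fitted parameters to every transition of interest."""
    return e_nrsc_all + sensitivity_all @ params


# Hypothetical data (eV): direct gap, indirect gap, one further transition.
e_nrsc_all = np.array([1.10, 0.95, 3.30])
sensitivity_all = np.array([[2.0, -1.5],
                            [1.2, -0.6],
                            [0.3, -0.1]])
e_target = np.array([0.90, 0.76])  # the two gaps to be matched exactly

params = fit_two_parameters(e_nrsc_all[:2], sensitivity_all[:2], e_target)
e_pert = perturbed_levels(e_nrsc_all, sensitivity_all, params)
print(params, e_pert)
```

The two fitted transitions are reproduced exactly by construction; the remaining transitions are then predictions of the adjusted scheme.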

In silicon, we must adopt a different strategy, since the direct band gap has
not yet been established experimentally, and since the indirect band gap is not
defined by either of the two most sensitive transitions, but rather by a transition
of lesser sensitivity, $\Delta_1^m-\Gamma_{25'}$, where m denotes the conduction band minimum.
Our strategy here will be to use the well-known indirect band gap as one
constraint on a two-parameter adjustment scheme, and then to determine the
remainder of the band structure as a function of one of the more sensitive
transitions. In practice, this approach pins down the insensitive transitions
quite nicely, and leaves only the sensitive transitions somewhat uncertain.
Recent electroreflectivity measurements (Seraphin, 1965) will then be examined
for possible clues concerning the sensitive transitions. It happens that
among our one-parameter family of possible energy band models, there are
two particular models that are favored by alternate theoretical interpretations
of these measurements.
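
The one-parameter family described here can be sketched in the same linear picture: one equation fixes the indirect gap, and the value assumed for a sensitive transition labels the members of the family. The numbers below are hypothetical and chosen only to show that a weakly sensitive transition varies little across the family.

```python
import numpy as np

# Hypothetical first-order sensitivities (eV per unit parameter) and
# hypothetical NRSC transition energies (eV).
SENS = {"indirect_gap": np.array([1.0, 0.5]),
        "sensitive": np.array([3.0, -1.0]),
        "insensitive": np.array([0.9, 0.55])}
E_NRSC = {"indirect_gap": 1.0, "sensitive": 3.0, "insensitive": 4.0}


def family_member(gap_target, sensitive_value):
    """Return all transition energies for one member of the family.

    The indirect gap is held at gap_target; the assumed value of the
    sensitive transition selects the member.
    """
    a = np.vstack([SENS["indirect_gap"], SENS["sensitive"]])
    rhs = np.array([gap_target - E_NRSC["indirect_gap"],
                    sensitive_value - E_NRSC["sensitive"]])
    p = np.linalg.solve(a, rhs)
    return {name: E_NRSC[name] + SENS[name] @ p for name in SENS}


members = [family_member(1.1, s) for s in (2.8, 3.0, 3.2)]
for m in members:
    print({name: round(float(val), 3) for name, val in m.items()})
```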


III. Grey Tin 


A. Energy band model 

Until the recent work of Groves and Paul (1964), grey tin was thought to 
be a direct band gap semiconductor with a band gap of about 0.08 eV. By a 
careful analysis of transport measurements, Groves and Paul were able to 
show that grey tin is actually a semimetal. For our purposes, it is sufficient to 
make use of Groves and Paul's experimental results in assigning values to the 
direct and indirect band gaps (the former is negative, incidentally). 

In obtaining our results for E(PERT) listed in Table 1, we used the arithmetic
average of four different two-parameter adjustment schemes. In two of
these schemes, the relativistic corrections were included implicitly in the two
ΔE(ΔV) terms, so that E(PERT) had the form
$E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E(\Delta V_1) + \Delta E(\Delta V_2)$.
In the remaining two schemes, these corrections were taken into account
explicitly by writing
$E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E(\Delta V_1) + \Delta E(\Delta V_2) + \Delta E(\mathrm{REL})$,
and using calculated values for ΔE(REL). In one pair
of schemes, ΔE(111) and ΔE(CORE) were used as energy level shifts; in the
other pair, we used one core shift for valence band levels and another core
shift for conduction band levels. In spite of considerable differences among the
four adjustment schemes:



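(a) $E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E(111) + \Delta E(\mathrm{CORE})$

(b) $E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E(111) + \Delta E(\mathrm{CORE}) + \Delta E(\mathrm{REL})$

(c) $E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E_v(\mathrm{CORE}) + \Delta E_c(\mathrm{CORE})$

(d) $E(\mathrm{PERT}) = E(\mathrm{NRSC}) + \Delta E_v(\mathrm{CORE}) + \Delta E_c(\mathrm{CORE}) + \Delta E(\mathrm{REL})$,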

where v and c refer to valence and conduction band levels, the deviations from
the average E(PERT) solution were usually less than 0.1 eV. This remarkable
insensitivity of E(PERT) to the choice of specific adjustment parameters will
be considered further in the section on germanium.
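
The averaging step itself is simple bookkeeping. As a minimal sketch, with made-up transition energies standing in for the four scheme solutions, the average scheme and the largest deviation of any scheme from it could be computed as follows.

```python
import numpy as np


def average_scheme(solutions):
    """Return the arithmetic-average scheme and the maximum deviation.

    solutions has one row per adjustment scheme and one column per
    transition; the deviation is taken per transition over all schemes.
    """
    arr = np.asarray(solutions, dtype=float)
    mean = arr.mean(axis=0)
    max_dev = np.abs(arr - mean).max(axis=0)
    return mean, max_dev


# Hypothetical E(PERT) values (eV) for three transitions, schemes (a)-(d).
solutions = [[2.00, 3.10, 4.05],
             [2.04, 3.06, 4.10],
             [1.96, 3.14, 4.00],
             [2.00, 3.10, 4.05]]
mean, max_dev = average_scheme(solutions)
print(mean, max_dev)
```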

It is noteworthy that the difference between E(NRSC) and E(PERT) is
about 0.3 or 0.4 eV for most of the transitions listed. Although the relativistic
corrections are quite large, ΔE(REL) for $\Gamma_{2'}-\Gamma_{25'}$ being nearly −1.4 eV,
these corrections are largely offset by other ion-core-type corrections which
enter through the core shift terms. For example, in scheme (b), it is found
that ΔE(CORE) and ΔE(REL) are nearly proportional to one another, so
that the sum ΔE(CORE) + ΔE(REL) plays the same role in scheme (b) that
the term ΔE(CORE) plays in scheme (a). We will say more about such questions
in Section IV,C.

In Table 1, our average E(PERT) solution for grey tin is compared with the
corresponding empirical pseudopotential solution E(PSEUDO) recently reported
by Cohen and Bergstresser (1966). The two solutions are quite similar,
the principal difference being in the values of the transition $\Gamma_{15}-\Gamma_{25'}$.

In germanium and silicon, as well as in grey tin, our predicted value for this
transition is consistently lower than Cohen and Bergstresser’s calculated value
by at least 0.5 eV, which is outside the ±0.05 eV numerical uncertainties
associated with E(PERT) or E(PSEUDO). We believe that our results are
more reliable than Cohen and Bergstresser’s on the basis of internal evidence,
such as the insensitivity of E(PERT) to changes in the detailed nature of the
adjustment procedure, the insensitivity of E(PERT) to changes in the detailed
nature of E(NRSC) (see the discussion on germanium), and the relative
magnitudes of our ΔE(ΔV) and those of Cohen and Bergstresser (see the discussion
on silicon).


B. Spin-orbit splitting 

Since the spin-orbit splitting of the $\Gamma_{25'}$ level in grey tin is about 0.7 eV
(Groves and Paul, 1964), it is expected that the p-like valence and conduction
bands passing through $\Gamma_{25'}$ and $\Gamma_{15}$ will be spin-orbit split in different regions
of the reduced zone by amounts varying from 0 to about 0.7 eV. The present
study of grey tin, and that of Cohen and Bergstresser (1966), merely map
out the centers of gravity of the various spin-orbit split bands. In order to
obtain a more realistic picture of the band structure of grey tin, it is obviously
necessary to take the spin-orbit interaction into account (Herman et al., 1966).

[Table 1, caption fragment] ... are uncertain to a few tenths of an electron volt are enclosed in parentheses. The experimental estimates (0°K) of Cohen and Bergstresser
are denoted by $E_0(\mathrm{EXPT})$, and their pseudopotential solutions by E(PSEUDO). Doubtful identifications are indicated by question marks. All
energies are in electron volts. Differences between E*(EXPT) and $E_0(\mathrm{EXPT})$ for $L_1-\Gamma_{25'}$ and $\Gamma_{2'}-\Gamma_{25'}$ reflect the temperature dependence of
the indirect and direct band gaps. Differences between E*(EXPT) and $E_0(\mathrm{EXPT})$ for other transitions usually reflect different interpretations of
experimental information as well as the temperature dependence of the band structure.

We are presently developing computer codes for calculating the spin-orbit
splitting at selected points in the reduced zone by first-order perturbation
theory. Our projected method for calculating the spin-orbit splitting (or,
what amounts to the same thing, the spin-orbit coupling corrections ΔE(SO))
will parallel the method we are already using for computing the relativistic
corrections ΔE(REL) from first principles.

Until such time as these computer codes become operational, we are planning
to determine the spin-orbit splitting by a simpler, but less comprehensive,
method. In particular, we are adopting the extended k · p representation
of Cardona and Pollak (1966), but we are fitting the parameterized k · p
energy band model (with spin-orbit splitting ignored) to our own energy
level scheme, E(PERT), rather than to fragmentary experimental information.
One of the most useful features of the extended k · p representation is the ease
with which the spin-orbit interaction can be taken into account. In its present
form, however, the k · p treatment of spin-orbit interaction is somewhat
oversimplified, in that the number of relevant parameters must be reduced to
as few as can be fitted to experiment (or judiciously estimated). We plan to
determine the spin-orbit splitting in grey tin in the same manner that
Cardona and Pollak (1966) determined this splitting in germanium, at
least for the present. In due course, we will be able to determine the
spin-orbit splitting throughout the reduced zone by using the extended k · p
representation as an interpolation scheme: the relevant parameters will be
determined partly by adjusting the k · p energy band model to experiment
(transition energies and spin-orbit splitting), and partly by adjusting this
model to our own E(PERT) and ΔE(SO) results. At any rate, our understanding
of the band structure of grey tin will remain somewhat rudimentary until
the spin-orbit splitting throughout the zone is taken into account.

It would also be highly desirable to have additional experimental information
concerning the band structure, so that current theoretical estimates of
various interband transitions could be checked.

We have learned only recently that the electroreflectivity spectrum of grey
tin has been determined by Cardona, McElroy, Pollak, and Shaklee using their
electrolytic technique* (M. Cardona, invited paper, Durham Meeting of the
American Physical Society, March 31, 1966). Since the successful interpretation
of this spectrum would resolve many current questions concerning the
band structure of grey tin away from the band edges, we are awaiting further
developments with keen anticipation.


* Shaklee et al., 1965, 1966; Cardona et al., 1966. 
IV. Germanium


A. Energy band model 

Since the direct and indirect band gaps of germanium are accurately known,
and since they are defined by the two most sensitive transitions, $\Gamma_{2'}-\Gamma_{25'}$ and
$L_1-\Gamma_{25'}$, it is expedient to proceed just as we did in the case of grey tin, and
obtain exact fits for these two transitions by using two-parameter adjustment
schemes. In studying germanium, we examined an even larger number of
two-parameter schemes than we did in studying grey tin. In Table 1, under
the heading E(PERT), we have displayed the arithmetic average of twelve
different two-parameter E(PERT) solutions. The individual solutions usually
deviated from the average solution by less than 0.1 eV, though in a few scattered
instances the variation was as large as ±0.2 eV.

Having established our own energy band model, let us now compare our
results for germanium with those obtained by Cohen and Bergstresser (1966).
These authors adjusted their empirical pseudopotential model, E(PSEUDO),
to their best estimate of the experimental energy level scheme at absolute zero
temperature, E0(EXPT). In our work, E(PERT) was adjusted to two transitions
in E*(EXPT), the room temperature experimental scheme. This
slight difference of approach is of minor importance in any comparison of
E(PERT) with E(PSEUDO), since the difference between E*(EXPT) and
E0(EXPT) is at most 0.1 eV, a value comparable to the combined numerical
uncertainties in E(PERT) and E(PSEUDO).

Cohen and Bergstresser obtain the best compromise fit they can using three
empirical pseudopotential parameters analogous to our empirical crystal
potential perturbations Δv(111), Δv(220), and Δv(311). We wish to emphasize
that our Δv(h) are not pseudopotential coefficients, but rather empirical
modifications of our NRSC crystal potential. Since the starting point of their
empirical adjustment procedure is considerably different from our own, as will
be explained more fully in our subsequent discussion of silicon (cf. Section
V,B), it is doubtful whether nearly identical E(PERT) and E(PSEUDO)
schemes could be generated by the use of analogous adjustment parameters,
even if there were complete agreement on both sides concerning the experimental
band scheme. In practice, E(PERT) and E(PSEUDO) are somewhat
different, partly because the adjustment procedures and starting points are
different, and partly because the physical assumptions concerning the
experimental situation are different.

As can be seen from Table 1, and also from Fig. 1, E(PERT) and
E(PSEUDO) are qualitatively similar, but quantitatively different in detail,
particularly in the neighborhood of the important conduction band state
Γ15, whose location determines the disposition of three of the four lowest
conduction bands in the central region of the reduced zone. Even though
E(PSEUDO) has been fitted to all known or estimated transitions, while
E(PERT) has been fitted only to the two most precisely known (and most
sensitive) transitions, E(PERT) provides at least as good a representation of
the overall energy band structure as does E(PSEUDO), judging from Table 1.
In view of our adjustment procedure, E(PERT) provides a better account of
the direct and indirect band gaps than does E(PSEUDO), and judging from
Fig. 1, a better account of the highest valence band and the lowest conduction
band throughout the entire reduced zone as well.

Fig. 1. Comparison of two energy band models for germanium (spin-orbit splitting
neglected). The present solution, E(PERT), is based on a two-parameter adjustment, with
L1-Γ25' = 0.76 eV, and Γ2'-Γ25' = 0.90 eV. The Cohen and Bergstresser (1966) solution is
based on a three-parameter pseudopotential band model. In both models, spin-orbit split
levels are represented by their weighted means.

After commenting briefly on the experimental evidence bearing on the
Γ15-Γ25' transition, we will examine the energy level shifts ΔE(ΔV) in somewhat
greater detail than previously, and indicate that an extremely flexible
adjustment scheme can be developed in terms of only three parameters, such
as Δv(111), Δv(220), and Δv(311). We will then demonstrate that an adjustment
scheme based on these three parameters favors our value of 2.7 ± 0.2 eV
for Γ15-Γ25', rather than Cohen and Bergstresser’s considerably higher value
of 3.5 eV. After presenting further theoretical arguments in support of our
two-parameter E(PERT) solution, we will turn to deformation potential
calculations and related matters.

B. Experimental evidence

In view of the significant difference between our predicted value for the
Γ15-Γ25' transition, 2.7 ± 0.2 eV, and Cohen and Bergstresser’s value of 3.5
eV, which was obtained with an eye on their experimental assignment, 3.4 eV,
we have reviewed the experimental evidence bearing on this transition, and
we find no conclusive evidence supporting the 3.4-eV assignment. For example,
in a theoretical paper discussing the photoelectric yield from cesium-covered
germanium, Cohen and Phillips (1965) show an experimental curve (due to
Allen and Gobeli) and indicate where the spin-orbit split Γ15-Γ25' transition
is expected to lie. The experimental evidence supporting this “expectation”
is obscure, and what is more, only the spin-orbit splitting of Γ25' is taken
into account, and that of Γ15, which is of the same order of magnitude as
the Γ25' spin-orbit splitting, is ignored.

The modulated electroreflectivity spectrum of germanium, as determined by
Seraphin (1964) in his pioneering studies, exhibits four peaks in the range
from 2.8 to 4.0 eV, the principal peak being at 3.65 eV. Seraphin has tentatively
suggested that these four peaks are somehow related to interband
transitions between spin-orbit split Γ25' and Γ15 states (or closely related
ones). In the light of our model, the weaker structure between 2.8 and 3.3 eV
could be attributed to such transitions, but the principal peak at 3.65 eV
would have to be assigned to interband transitions not directly related to
Γ15-Γ25'.

As will be indicated more fully in our subsequent discussion of silicon,
current theories of electroreflectance are not sufficiently comprehensive to
provide clearcut and unambiguous interpretations of many features of
electroreflectivity spectra. Moreover, the intrinsic nature of some characteristic
features of such spectra can be obscured by unfavorable experimental
conditions. Therefore, we must await further theoretical and experimental
developments before attempting to decide the question of the Γ15-Γ25'
transition on the basis of Seraphin’s exploratory electroreflectivity
measurements. In any event, Seraphin’s work has paved the way for a host of
high-resolution optical studies which should contribute greatly to our
understanding of the band structure of germanium and other crystals away
from their band edges. (See Note Added in Proof.)

C. Further remarks about ΔE(ΔV)

We noted earlier that the deviations of twelve different two-parameter
E(PERT) solutions from their arithmetic average were usually less than 0.1
eV. This is indeed a remarkable result, since it suggests that the detailed form
of E(PERT) is not critically dependent on the choice of adjustment parameters.
Actually, this result is partly due to our fitting the most sensitive
transitions to experiment: the remaining transitions, being less sensitive than
the fitted ones, are affected only slightly, but often significantly, by the
energy shifts ΔE(ΔV). For example, Γ15-Γ25' is shifted from the E(NRSC)
value of 3.0 eV to the E(PERT) value of 2.7 ± 0.2 eV.

Further insight into the nature of the ΔE(ΔV) energy shifts can be
gained by studying Table 2, where the key transition energy shift patterns
for germanium, suitably normalized, are arranged into four classes according
to their broad qualitative features. The patterns within any one class are
more nearly alike than are patterns belonging to different classes. Since we
are primarily concerned with valence and conduction band levels derived
from atomic s and p states, and since these levels are not particularly sensitive
to physically reasonable changes in the 3d core level, the ΔE(CORE/d)
pattern can be ignored, leaving only three classes to be considered further.

If two ΔE(ΔV) terms associated with the same shift pattern are employed in
a two-parameter adjustment to two experimental transitions, a physically
reasonable fit involving small ΔV cannot be obtained, since the two ΔE(ΔV)
terms are too nearly alike, i.e., too nearly linearly dependent. [The question of
linear dependence in empirical pseudopotential adjustment schemes has been
studied by Kane (1966) in an informative paper.] On the other hand, quite
reasonable fits involving small ΔV can be obtained by using two ΔE(ΔV)
terms associated with different shift patterns. Since it is primarily the
pattern that counts, rather than the specific member of the pattern, it can be
seen that many different two-parameter schemes can be devised, but that they
all represent essentially three different combinations: patterns 1 and 2,
patterns 2 and 3, and patterns 3 and 1. In setting up twelve different
two-parameter adjustment schemes for germanium, we used representatives of all
three combinations.
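
The linear-dependence argument can be illustrated with a minimal sketch. It assumes, purely for illustration, that each fitted transition shifts linearly with the two adjustment parameters, so that the fit reduces to a 2 × 2 linear system. The two sensitivity matrices are hypothetical and are not taken from Table 2; only the required shifts (+0.72 eV for Γ2'-Γ25' and +0.10 eV for L1-Γ25') follow from the Slater (λ = 1) and E(PERT) columns of Table 3.

```python
def solve_two_parameter(S, target):
    """Solve S @ p = target for a 2x2 sensitivity matrix S (Cramer's rule).

    S[i][j] is the shift of transition i per unit change of parameter j.
    """
    (a, b), (c, d) = S
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("shift patterns are linearly dependent")
    p1 = (target[0] * d - b * target[1]) / det
    p2 = (a * target[1] - c * target[0]) / det
    return p1, p2


def magnitude(p):
    """Euclidean size of the parameter vector, a stand-in for 'small ΔV'."""
    return (p[0] ** 2 + p[1] ** 2) ** 0.5


# Shifts needed to bring E(NRSC) onto experiment (eV), from Table 3:
# Γ2'-Γ25': 0.18 -> 0.90, L1-Γ25': 0.66 -> 0.76.
required_shift = (0.72, 0.10)

# HYPOTHETICAL sensitivities (rows: transitions, columns: parameters).
S_different_patterns = ((1.0, 0.2), (0.3, 1.0))    # columns clearly distinct
S_same_pattern = ((1.0, 0.95), (0.3, 0.30))        # columns nearly parallel

p_different = solve_two_parameter(S_different_patterns, required_shift)
p_same = solve_two_parameter(S_same_pattern, required_shift)

print("different patterns:", p_different, magnitude(p_different))
print("same pattern:      ", p_same, magnitude(p_same))
```

In this toy model both fits reproduce the two transitions exactly, but the nearly parallel pair does so only with parameters more than ten times larger, which is the sense in which such a fit is not physically reasonable.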

We are now in a position to choose the most flexible adjustment scheme
involving the minimum number of adjustment parameters: we simply have to
include one representative from each of the three major patterns. Accordingly,
let us set up a three-parameter scheme based on Δv(111), Δv(220), and Δv(311).
Not only do these three parameters belong to different patterns, but they also
happen to be analogous to the three pseudopotential parameters used by
Cohen and Bergstresser. Since we now have three parameters at our disposal,
we will adjust Γ2'-Γ25' and L1-Γ25' to their proper E*(EXPT) values as
before, and we will use an assumed value for Γ15-Γ25' as our third constraint.
In order to compare our previous average two-parameter E(PERT) solution
with Cohen and Bergstresser’s three-parameter E(PSEUDO) solution, we will
choose values for Γ15-Γ25' which correspond to these two solutions. However,
we will use 2.8 rather than 2.7 eV for E(PERT) for a reason that will be
indicated shortly. Setting Γ15-Γ25' first to 2.8 and then to 3.5 eV, we obtain
the following three-parameter E(PERT) solutions:

               Γ15-Γ25'    Δ1-Γ25'    L1-L3'    X1-X4    L3-L3'

  E(PERT)  =     2.80        0.99       2.05      4.07     5.33 eV
  E(PERT)  =     3.50        0.74       1.88      3.48     5.87 eV

  For comparison:

  E*(EXPT) =                 0.96       ~2.1      ~4.1     (5.4) eV


It is now clear that an exact fit to the direct and indirect band gaps, coupled
with a choice of 2.8 eV for Γ15-Γ25', leads to a three-parameter E(PERT)
model which is nearly identical to our previous average two-parameter
E(PERT) model, and far more important, which is in excellent agreement with
experiment. On the other hand, when we adjust the direct and indirect band
gaps to experiment, and then adjust Γ15-Γ25' to 3.5 eV (Cohen and
Bergstresser’s value), the agreement between theory and experiment for some of
the other key transitions is ruined. In view of the flexible nature of our
three-parameter fit, we take this comparison as a strong argument in favor of our
value of 2.8 eV for this transition.

As a further argument, we observe that if the magnitude of ΔV is plotted as
a function of the value assigned to Γ15-Γ25', this curve has a sharp minimum
at Γ15-Γ25' = 2.8 eV. (This is why we used 2.8 rather than 2.7 eV in the above
comparison.) In other words, if we accept E(NRSC) as a sound starting point
for a small empirical adjustment, if we use a very flexible three-parameter
adjustment scheme, and if we require the adjustment to give the correct
experimental values for the direct and indirect transitions, then the smallest
empirical adjustment consistent with these considerations is one giving a
value of 2.8 eV for the Γ15-Γ25' transition.
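
The shape of this argument can be sketched in a linearized toy model. The sketch assumes that the three fitted transitions respond linearly to Δv(111), Δv(220), and Δv(311) through a 3 × 3 sensitivity matrix; that matrix is hypothetical, so the position of the minimum found below has no physical meaning and is not meant to reproduce the 2.8 eV quoted above. The E(NRSC) starting values and the two band gaps are those of Table 3 (Slater column).

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(S, rhs):
    """Solve S @ p = rhs by Cramer's rule."""
    d = det3(S)
    if abs(d) < 1e-12:
        raise ValueError("sensitivity matrix is singular")
    solution = []
    for col in range(3):
        m = [list(row) for row in S]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d)
    return solution


# Table 3 values (eV): Γ2'-Γ25', L1-Γ25', Γ15-Γ25' in E(NRSC), Slater exchange.
E_NRSC = (0.18, 0.66, 3.01)
GAPS_EXPT = (0.90, 0.76)          # direct and indirect gaps, fitted exactly

# HYPOTHETICAL sensitivities: rows are the three transitions above,
# columns are Δv(111), Δv(220), Δv(311).
S_HYPOTHETICAL = ((1.0, 0.2, -0.5),
                  (0.3, 1.0, 0.1),
                  (-0.4, 0.5, 1.0))


def adjustment(third_target):
    """Parameters that fit both gaps and an assumed Γ15-Γ25' value."""
    rhs = (GAPS_EXPT[0] - E_NRSC[0],
           GAPS_EXPT[1] - E_NRSC[1],
           third_target - E_NRSC[2])
    return solve3(S_HYPOTHETICAL, rhs)


def magnitude(p):
    return sum(x * x for x in p) ** 0.5


# Scan the assumed Γ15-Γ25' value and locate the smallest adjustment.
grid = [2.0 + 0.05 * k for k in range(41)]
best = min(grid, key=lambda t: magnitude(adjustment(t)))
print("smallest adjustment at", round(best, 2), "eV in this toy model")
```

Because the parameters depend linearly on the assumed third value, the magnitude of the adjustment always has a single minimum along the scan; the physical content of the argument lies in where that minimum falls for the actual shift patterns.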

D. Further remarks about E(NRSC)

Having already indicated that the route from E(NRSC) to E(PERT) via
ΔE(ΔV) is not critically dependent upon the detailed nature of ΔV, we will
now indicate, more by way of example than in general terms, that the choice
of E(NRSC) is also not critical, provided E(NRSC) is already in good qualitative
agreement with experiment in its own right. In particular, we will show
what happens when the NRSC energy band structure of germanium is
recalculated using the free-electron exchange approximation recently proposed
by Kohn and Sham (1965) in place of Slater’s. The Kohn-Sham exchange
term, which is derived from a variational principle, is proportional to Slater’s,
but only 2/3 as large, so that the substitution of the former for the latter
leads to a significant change in E(NRSC), as can be seen from Table 3.
However, even though the E(NRSC) scheme based on the Kohn-Sham
exchange term differs from that based on Slater’s by as much as 1 eV, both
schemes lead to adjusted E(PERT) schemes which differ from one another by
less than 0.1 eV. Thus our point is illustrated.

As a brief digression, we observe that the E(NRSC) scheme based on the
Kohn-Sham exchange term compares as well with E*(EXPT) as the E(NRSC)
scheme based on Slater’s, but that an intermediate scheme compares more
favorably with experiment than either of the others. This suggests a
generalization of our adjustment scheme according to which the Slater free-electron
exchange term is multiplied by a factor λ which is to be determined
empirically. (The Kohn-Sham exchange approximation, based on theoretical
considerations, corresponds to λ = 2/3.) We can obtain a family of NRSC
solutions by using different values of λ, as well as a corresponding family of
adjusted solutions:

    Eλ(PERT) = Eλ(NRSC) + ΔEλ(ΔV).

Eλ(PERT) can now be adjusted to E*(EXPT) partially by the choice of λ, and
partially by the choice of ΔV. The greater the fractional role played by λ in
bringing Eλ(PERT) close to E*(EXPT), the smaller the role that ΔV itself
must play. In many respects, it would be advantageous to minimize ΔEλ(ΔV)
by a suitable choice of λ, so that Eλ(NRSC) itself is as close to E*(EXPT) as
possible. Preliminary studies indicate that λ serves well as an adjustment
parameter in its own right, but that all the key transitions in the Eλ(PERT)
scheme cannot be brought into exact register with their experimental
counterparts by the use of the λ adjustment alone. At any rate, we see that a reduction
in ΔEλ(ΔV) goes hand in hand with a shift of the Γ15-Γ25' transition from 3.0
to 2.75 eV. This is one further indication of the most likely value for this
transition; cf. Table 3.

TABLE 3

Effect of Changing the Magnitude of the Free-Electron Exchange Term on the
Nonrelativistic Self-Consistent (NRSC) Energy Band Structure of Germanium (a)

              Nonrelativistic self-consistent
              solution, Eλ(NRSC)                      Adjusted    Experiment
              λ = 2/3      λ = 5/6        λ = 1       solution
Transition    Kohn-Sham    Intermediate   Slater      E(PERT)     E*(EXPT)

L1-Γ25'       0.44         0.54           0.66        0.76        0.76
Γ2'-Γ25'      0.80         0.50           0.18        0.90        0.90
X1-Γ25'       0.60         1.06           1.52        1.2         ~1.2
L1-L3'        1.82         1.82           1.85        2.1         ~2.1
Γ15-Γ25'      2.50         2.75           3.01        2.7
X1-X4         3.70         3.94           4.20        4.1         ~4.1
L3-L3'        5.07         5.31           5.56        5.3         (5.4)

(a) The purpose of this comparison is not to judge the relative merits of the Kohn-Sham
and Slater free-electron exchange approximations, but rather to show that the magnitude
of the free-electron exchange term can be modified empirically, through the use of the λ
factor, so as to bring Eλ(NRSC) closer to E*(EXPT). A number of previous authors have
used λ factors other than 1 to improve the agreement between theory and experiment; see,
for example, Lindgren (1965b). The reader should bear in mind that the optimum value of λ
may be different from crystal to crystal. For example, preliminary studies indicate that the
optimum value of λ is closer to unity in silicon than in germanium. The column Eλ(NRSC),
λ = 1, corresponds to the germanium E(NRSC) column in Table 1. The columns E(PERT)
and E*(EXPT) have the same meanings and values here as in Table 1. All energies are in
electron volts.
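
The statement that the intermediate exchange factor compares more favorably with experiment than either limiting case can be checked roughly from the entries of Table 3. The sketch below uses a root-mean-square deviation as the figure of merit, which is a choice made here for illustration and not a measure used in the text. It treats the approximate (~) and parenthesized experimental entries as plain numbers and omits Γ15-Γ25', for which the table lists no experimental value.

```python
import math

# E*(EXPT) entries of Table 3 (eV); "~" and "( )" qualifiers are dropped.
EXPT = {
    "L1-G25'": 0.76, "G2'-G25'": 0.90, "X1-G25'": 1.2,
    "L1-L3'": 2.1, "X1-X4": 4.1, "L3-L3'": 5.4,
}

# E_lambda(NRSC) columns of Table 3 (eV), keyed by the exchange factor.
NRSC = {
    "2/3": {"L1-G25'": 0.44, "G2'-G25'": 0.80, "X1-G25'": 0.60,
            "L1-L3'": 1.82, "X1-X4": 3.70, "L3-L3'": 5.07},
    "5/6": {"L1-G25'": 0.54, "G2'-G25'": 0.50, "X1-G25'": 1.06,
            "L1-L3'": 1.82, "X1-X4": 3.94, "L3-L3'": 5.31},
    "1":   {"L1-G25'": 0.66, "G2'-G25'": 0.18, "X1-G25'": 1.52,
            "L1-L3'": 1.85, "X1-X4": 4.20, "L3-L3'": 5.56},
}


def rms_deviation(calculated, experiment):
    """Root-mean-square difference over the transitions in `experiment`."""
    squares = [(calculated[name] - value) ** 2
               for name, value in experiment.items()]
    return math.sqrt(sum(squares) / len(squares))


rms = {lam: rms_deviation(column, EXPT) for lam, column in NRSC.items()}
for lam, value in rms.items():
    print(f"lambda = {lam}: rms deviation = {value:.2f} eV")
```

With these entries the two limiting cases come out at roughly 0.35 to 0.37 eV and the intermediate case at about 0.24 eV, in line with the qualitative statement in the text.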

E. Deformation potential studies 

As an application of our perturbed self-consistent field approach to energy
band calculations, we undertook a study of the effect of hydrostatic pressure
on the energy band structure of germanium, hoping to improve upon an
earlier and somewhat unsatisfactory pseudopotential study by Bassani and
Brust (1963). Having obtained E(NRSC), ΔE(ΔV), and E(PERT) for germanium
at normal pressure (lattice constant = a0), we proceeded to obtain
E(NRSC) for a number of other lattice constants; expressed in units of
a0, these included: a/a0 = 0.90, 0.95, 1.05, and 1.10. Using three-point
interpolation (a/a0 = 0.95, 1.00, 1.05), we obtained values for the deformation
potentials D (change in energy per unit dilatation) for the various energy levels
of interest (cf. Bardeen and Shockley, 1950). These deformation potentials will
be denoted by the symbol D(NRSC) to indicate that they are based on the
E(NRSC) scheme. To check the numerical accuracy of the three-point
interpolation, we repeated the deformation potential calculations using
five-point interpolation (a/a0 = 0.90, 0.95, 1.00, 1.05, 1.10). The five-point results
agreed with the three-point results to a few percent.
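
The interpolation step can be sketched as follows. The sketch assumes that three-point and five-point interpolation mean differentiating the interpolating polynomial at the central point, which gives the standard central-difference formulas, and that the dilatation of a cubic crystal is 3 Δa/a, so that D = (1/3) dE/d(a/a0) at a = a0. The cubic model for the transition energy and its coefficients are hypothetical stand-ins for computed E(NRSC) values.

```python
H = 0.05  # spacing of the a/a0 grid used in the text


def transition_energy(x):
    """HYPOTHETICAL transition energy (eV) as a function of x = a/a0."""
    u = x - 1.0
    return 0.9 - 30.0 * u + 40.0 * u ** 2 - 100.0 * u ** 3


def slope_three_point(f, h=H):
    """dE/dx at x = 1 from the parabola through x = 1-h, 1, 1+h."""
    return (f(1.0 + h) - f(1.0 - h)) / (2.0 * h)


def slope_five_point(f, h=H):
    """dE/dx at x = 1 from the quartic through x = 1-2h, ..., 1+2h."""
    return (f(1.0 - 2 * h) - 8.0 * f(1.0 - h)
            + 8.0 * f(1.0 + h) - f(1.0 + 2 * h)) / (12.0 * h)


def deformation_potential(slope):
    """Energy change per unit dilatation, using dV/V = 3 da/a."""
    return slope / 3.0


d3 = deformation_potential(slope_three_point(transition_energy))
d5 = deformation_potential(slope_five_point(transition_energy))
print(f"three-point D = {d3:.3f} eV, five-point D = {d5:.3f} eV")
print(f"relative difference = {abs(d3 - d5) / abs(d5):.2%}")
```

For this toy curve the two estimates differ by somewhat less than one percent; the comparison in the text plays the same role of checking that the coarser stencil is adequate.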

Since we have made no serious attempt to determine how the zero of energy 
changes with lattice constant, it would not be particularly meaningful to quote 
our absolute deformation potential values for individual levels. Instead, we 
will list the net deformation potential differences—also denoted by 
D(NRSC)—for several key transitions, and compare them with experiment, as 
well as with other theoretical results. All of this is shown in Table 4. 

Our present deformation potential values could be improved by taking the
lattice-constant dependence of ΔE(ΔV) into account, and basing the deformation
potential calculation on the E(PERT) scheme rather than on the
E(NRSC) scheme. The work of Kleinman (1963) suggests that the
lattice-constant dependence of the spin-orbit corrections ΔE(SO) should also be
taken into account in any serious treatment of deformation potentials. In
the case of germanium, we estimate that the inclusion of these two refinements
could change our D(NRSC) values by as much as 10%. Of course, our
D(NRSC) values for the direct Γ2'-Γ25' and indirect L1-Γ25' and Δ1-Γ25'
transitions are sufficiently close to the corresponding experimental values to
give us some measure of confidence in the quantitative validity of our
(simplified) deformation potential calculations.

TABLE 4

Comparison of Theoretical and Experimental Net Deformation Potential Differences for Silicon and Germanium, in eV Per Unit

Zallen (1964) has measured the pressure dependence of a number of
optical reflectivity peaks in germanium, silicon, and several other
semiconductors. From these measurements he is able to deduce the deformation
potentials associated with these peaks. Some of his results are listed in Table
4. In more recent work, Zallen (private communication) has obtained the
value D(EXPT) = -4.2 ± 0.4 eV for the main reflectivity peak in germanium,
which occurs at an energy of 4.5 eV. Even though it is known from
the detailed survey of the reduced zone carried out by Brust (1964, 1965),
and from related studies by Kane (1966), that an extended region of the
reduced zone contributes to this peak, it is convenient, though somewhat
misleading, to associate this peak with the X1-X4 transition.

According to Kane, whose study was confined to silicon but whose results
are also applicable to germanium, the X1-X4 transition is 0.2 eV lower in
energy than the ε2(ω) peak, which in turn is 0.2 eV lower in energy than the
main reflectivity peak. [This places the X1-X4 transition energy (in both
germanium and silicon) at about 4.1 eV, in excellent agreement with our
theoretical predictions (cf. Tables 1 and 5).] Kane also finds that the region
in the reduced zone near the X point makes only a small contribution to the
main reflectivity peak and to the imaginary part of the complex dielectric
constant, ε2(ω), which is a more direct measure of the joint interband density
of states. Therefore, even though the X1-X4 transition may lie close in energy
to the main peak, this transition should not be regarded as representative. It is
hardly surprising, then, that our theoretical deformation potential value for
X1-X4 in germanium, D(NRSC) = -2.5 eV, is considerably different from
Zallen’s experimental value, D(EXPT) = -4.2 ± 0.4 eV. The magnitude
of this difference merely serves to emphasize how nonrepresentative of the
main peak the X1-X4 transition really is. In order to account properly for the
experimental deformation potential value for the 4.5-eV reflectivity peak, it
would be necessary to make a detailed survey of the reduced zone, in the
manner of Brust (1964, 1965), and to weigh the deformation potentials of the
various contributing interband transitions according to their importance.

The experimental deformation potential for the 2.2-eV reflectivity peak
(this is really a 2.1, 2.3 eV doublet) is -5.7 ± 0.3 eV (Zallen, 1964;
Gerhardt, 1965a,b). The 2.2-eV peak is nominally assigned to a Λ1-Λ3 transition
associated with a critical point lying along the [111] axis, somewhere between
L and Γ, but closer to Γ than to L (Brust, 1964; Cardona and Pollak, 1966).
According to our calculations, D(NRSC) remains close to -5.0 eV most
of the way from L to Γ, changing rapidly to -10.3 eV only in the neighborhood
of Γ. [Note that D(NRSC) = -10.3 eV is our value for the Γ2'-Γ25'
transition (cf. Table 4).] Since the Λ1-Λ3 critical point is located in the range
where D(NRSC) is close to -5.8 eV, our calculations are consistent with
experiment so far as the 2.2-eV peak is concerned.

In any event, it is clear that accurate and comprehensive deformation
potential information (both theoretical and experimental) could be used to
considerable advantage in checking the identity of interband transitions
principally responsible for characteristic features of optical spectra. One of
the most exciting prospects for future research is the determination of the
deformation potential of an individual transition, rather than the integral
value of the deformation potential taken over all the transitions that
contribute to a broad optical reflectivity peak. Since critical point transitions are
brought out quite clearly in electroreflectivity spectra, the deformation
potentials of such transitions could be determined by measuring the hydrostatic
pressure dependence of the electroreflectivity spectrum.

F. Extreme pressure studies 

Just as the study of the pressure (or lattice constant) dependence of the band
structure in the neighborhood of a0 leads to information concerning deformation
potentials, so the same study over an extended lattice constant range
casts fresh light on the dramatic changes in band structure that would be
produced by extreme pressures, say millions of atmospheres (assuming
germanium retained the diamond structure at these pressures). The connection
between the energy band spectrum and the free atom energy levels is also
illuminated by such a study.

It is known that there are at least three high-pressure modifications of
germanium (Bates et al., 1965), namely, Ge II (white tin structure), Ge III
(body-centered tetragonal structure), and Ge IV (modified body-centered
cubic structure). In this notation, ordinary germanium is called Ge I
(diamond structure). For a discussion of the crystal structures of some
high-pressure modifications of silicon and germanium, see, for example, Kasper
and Richards (1964). At room temperature, and under equilibrium conditions,
Ge I converts to Ge III at about 25,000 atm, while Ge III converts to Ge IV
at about 120,000 atm. Under nonequilibrium conditions, Ge I can persist in a
metastable form above 25,000 atm, but at about 120,000 atm, Ge I converts to
Ge II.*

* In view of the experimental difficulties associated with high pressure measurements,
particularly the establishment of pure hydrostatic pressure, the stability ranges of Ge I
through IV are not known with certainty. The values quoted above seem to be the best
available at this time, but they should be accepted with caution. The authors wish to
acknowledge an informative correspondence with Professor William Klement on this point.

In the present study, we shall consider only diamond-type germanium, Ge
I, though we hope to study some of the high pressure modifications at a later
date. It is beyond the scope of our present treatment, of course, to predict
the critical pressures at which phase transitions occur. Our attention will be
focused primarily on the pressure dependence of the electronic energy level
scheme.

The energy level scheme of Ge I at six different points in the reduced zone
is depicted as a function of lattice constant in Figs. 2 and 3. At the normal
lattice constant, a = a0 = 5.657 Å, the band structure shown in Figs. 2 and 3
corresponds to the E(PERT) band structure displayed previously in Fig. 1.
The band structure in the range 0.6 ≤ a/a0 ≤ 1.1 was determined from the
relation

    E(PERT; a) = E(NRSC; a) + ΔE(ΔV; a0),

where E(NRSC; a) is the NRSC solution for lattice constant a, and ΔE(ΔV; a0)
is the energy level shift pattern obtained previously by fitting

    E(PERT) = E(PERT; a0) = E(NRSC; a0) + ΔE(ΔV; a0)

to E*(EXPT). Since the difference between ΔE(ΔV; a) and ΔE(ΔV; a0) is
estimated to be reasonably small in this range, ΔE(ΔV; a0), which we
know, is used in place of ΔE(ΔV; a) throughout this range. In order to
insure the proper asymptotic behavior of the energy band structure at large
values of a/a0, i.e., a gradual approach to the atomic energy level scheme, it
was necessary to assume that ΔE(ΔV; a) approached zero as a/a0 approached
infinity. In practice, ΔE(ΔV; a) was required to vary linearly from ΔE(ΔV; a0)
at a/a0 = 1.1 to zero at a/a0 = 1.7.
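
The prescription just described amounts to scaling the fitted shift by a piecewise-linear factor of a/a0. A minimal sketch follows; the function names are introduced here for illustration, and holding the factor at zero beyond a/a0 = 1.7 is an assumption suggested by, but not stated in, the text.

```python
def shift_scale(x):
    """Fraction of ΔE(ΔV; a0) applied at x = a/a0.

    1 up to x = 1.1, falling linearly to 0 at x = 1.7.
    Assumption: the factor stays at 0 for x > 1.7.
    """
    if x <= 1.1:
        return 1.0
    if x >= 1.7:
        return 0.0
    return (1.7 - x) / (1.7 - 1.1)


def e_pert(e_nrsc_at_a, shift_at_a0, x):
    """E(PERT; a) from E(NRSC; a) and the shift fitted at a0 (all in eV)."""
    return e_nrsc_at_a + shift_scale(x) * shift_at_a0


# Check at the normal lattice constant with the direct gap of Table 3:
# E(NRSC) = 0.18 eV and a fitted shift of 0.72 eV give E(PERT) = 0.90 eV.
direct_gap_at_a0 = e_pert(0.18, 0.72, 1.0)
print(round(direct_gap_at_a0, 2))
print(shift_scale(1.4))
```

The factor is continuous at both break points, so the band structure obtained this way joins smoothly in value to the fitted scheme near a0 and to the unadjusted E(NRSC) scheme at large lattice constants.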

Our band calculations were carried out for several values of a/a0. Since Ge I
converts to Ge II at about a/a0 = 0.95, there is no practical need to study Ge I
for values of a/a0 much below say 0.9. However, for the sake of completeness,
we carried our calculations down to values of a/a0 as small as 0.6. At this
ratio, the 3d core band begins to overlap the 4s valence band.

At a/a0 = 1, the core band atomic orbitals located on adjacent lattice sites
do not overlap each other to any significant extent. As a/a0 is reduced below
1, the nearest neighbor overlap increases gradually, most noticeably among
the 3d atomic orbitals. For the purposes of the present study, it did not appear
necessary to take this overlap into account. If we were seriously interested in
obtaining quantitatively reliable results for a/a0 below about 0.8, core
orbital overlap effects would have to be included in our theoretical treatment.

Our calculations were carried up to a/a0 ratios as large as 1.8, where it
became apparent that OPW expansions containing more than our 120 terms
would be required to represent the valence and conduction band wave
functions properly. We believe that Figs. 2 and 3 are qualitatively reliable
over the entire range shown, and quantitatively reliable in the restricted range
between a/a0 = 0.8 and 1.2. It is indeed gratifying that the OPW method can
be used so effectively over so broad a lattice constant range.

Fig. 2. Energy band structure of germanium as a function of lattice constant at
Γ(0 0 0), Δ(½ 0 0), and X(1 0 0). The lattice constant at normal pressure is denoted by a0.
The zero of energy is placed at the Γ25' level.

Fig. 3. Energy band structure of germanium as a function of lattice constant at
1 0), W(1 ½ 0), and L(½ ½ ½). The lattice constant at normal pressure is denoted by a0.
The zero of energy is placed at the Γ25' level.

Let us now examine Figs. 2 and 3 in some detail. As the lattice constant is
reduced (from large values), there is no forbidden band across the entire
Brillouin zone until the lowest conduction band level, Γ2', is swept above the
highest valence band level, Γ25', at about a = 1.04 a0. As the lattice constant
is reduced further, the distinction of being the lowest conduction band level
switches from Γ2' to L1 just slightly above a = a0. At the normal lattice
constant, Γ2' lies only 0.14 eV above L1, while Δ1 lies only 0.06 eV above
Γ2' (cf. Table 1 and Fig. 1). As the lattice constant is reduced still further, L1
rises above Δ1 at about a = 0.98 a0, and the conduction band edge switches
once more. According to our theoretical work, the switch from [111] to [100]
conduction band minima occurs at about 50,000 atm, which is consistent with
experiment (Slykhouse and Drickamer, 1958), and with earlier theoretical
discussions (see, for example, Herman, 1954). Of course, Slykhouse and
Drickamer were studying Ge I in a metastable form, rather than Ge III, which
can also exist at 50,000 atm.

It has been suggested by Musgrave (1964) that there is a relationship
between the critical (phase transition) pressure and the energy gap in germanium
and related semiconductors. In his view, the change from semiconducting
to metallic behavior, associated with the change in crystal structure
at the critical pressure, is brought about by a destabilizing electronic
configuration which can occur when the conduction band becomes critically
populated, presumably due to an overlap or near overlap of valence and
conduction bands. Our results do not support this view; we find that the
conduction band does not begin to overlap the valence band until the lattice
constant is reduced to about 0.8 a0, which corresponds to a pressure of about
475,000 atm, or nearly four times the pressure required to transform Ge I to
Ge II. As we have already said, Ge I can be transformed into Ge III at only
25,000 atm under equilibrium conditions (at room temperature). Thus, there
appears to be no direct connection between the occurrence of phase
transformations and the incipient overlap of valence and conduction bands, at least
in germanium.

When energy level vs. lattice constant diagrams are derived using the
Wigner-Seitz cellular method, the zero of energy at different lattice constants
is determined by the cellular boundary conditions, as is noted by Kimball
(1935) in his paper on diamond, and also by Bardeen and Shockley (1950) in
their paper on deformation potentials in diamond-type crystals. The variation
of the zero of energy with lattice constant is therefore automatically included
in such diagrams, but this variation does not necessarily have any precise
physical significance. In our own work, we decided to use the Γ25' level, which
defines the top-most valence band level at a0, as the zero of energy at all lattice
constants. This representation was found to be particularly suitable for
delineating the variation of transition energies with lattice constant.

With this choice of reference level, some energy levels such as Γ2' and
rise monotonically as the lattice constant is reduced, while others such as
Γ15 and L3 first rise, then reach a plateau, and finally fall. Nonmonotonic
behavior of this type can often be traced to a changing admixture of s, p, and
d character in the crystal wave functions within the ion core regions. For
example, at very large interatomic distances, the wave functions for the lower
and upper Γ15 states shown in Fig. 2 can be represented by linear combinations
of 4p and 4d atomic orbitals, respectively. As the interatomic distance is
decreased, these two states interact more and more strongly, until at sufficiently
small interatomic distances the lower state is so strongly repelled by
the upper that its previous upward trend is arrested and ultimately reversed.
At the same time, the lower Γ15 state gains more and more 4d atomic character
at the expense of the 4p.

Even though the upper valence bands and the lower conduction bands are
derived primarily from atomic 4s and 4p levels, their detailed form at a₀
is strongly influenced by their proximity to the higher-lying conduction bands
derived from atomic 4d (and also 5s and 5p) levels. This influence is most
apparent when the lattice constant is changed, since this leads to changes in
the fractional 4s, 4p, and 4d atomic characters of the states bordering the
forbidden band. The upper valence bands, like the conduction bands passing
through the state Γ₁₅, gain 4d atomic character at the expense of 4p atomic
character, as a/a₀ is reduced below 1. It remains to be seen whether this
dramatic change in the orbital character of the upper valence band states is an
important factor in favoring the transformation of germanium from
tetrahedrally coordinated Ge I to the high-pressure modifications (Ge II, III,
IV), which have distorted tetrahedral coordination.

V. Silicon 


A. Preliminary remarks 

Although our knowledge of the valence and conduction band edges in 
silicon is quite extensive, the remainder of the band structure is understood 
more in a qualitative than a quantitative sense. There has been considerable 
speculation about energy level assignments and the interpretation of the 
optical reflectivity spectrum (for a review, see Phillips, 1966), but our detailed 
understanding of the relationship between the energy band structure and 
reflectivity spectrum remains incomplete. The theoretical studies of Brust 
(1964, 1965) and Kane (1966) have provided a good picture of the interband 
transitions primarily responsible for the main reflectivity peak at 4.5 eV and 
the broad but weaker peak at about 5.3 eV. These authors have also been able 
to generate a 3.4-eV peak in their theoretical spectrum, but since this result is 
much more sensitive to details of their energy band model than are their other 
results, we do not believe that their theoretical 3.4-eV peak is necessarily
related to its experimental counterpart. In short, we still regard the nature of 
the 3.4-eV reflectivity peak as an open question. 

Most authors (Phillips, 1962, 1966; Brust, 1964, 1965; Cohen and
Bergstresser, 1966; Cardona and Pollak, 1966; Kane, 1966) assign the 3.4-eV
reflectivity peak to the Γ₁₅-Γ₂₅′ transition (or to closely related Δ₁-Δ₅
transitions). The Γ₁₅-Γ₂₅′ assignment is of course nominal, since such a
transition would give rise to an edge rather than a peak in the reflectivity
spectrum. The general consensus seems to be that interband transitions near
the zone center, of the Δ₁-Δ₅ type, are the primary contributors to the 3.4-eV
peak.

Since we regard the experimental evidence bearing on this assignment as 
inconclusive, we will simply ignore the 3.4-eV peak in the initial stages of our 
investigation, and develop a model for the band structure without explicit 
reference to this peak. With our energy band model in hand, we will review 
relevant experimental evidence and draw our own conclusion concerning the 
nature of the 3.4-eV reflectivity peak. 

B. Energy band model 

In view of our satisfactory treatment of grey tin and germanium in terms of
two-parameter empirical adjustments, we will also attempt a two-parameter
treatment of silicon. Since the energies of the two most sensitive transitions in
silicon are not as firmly established as they are in germanium and grey tin, we
must now proceed somewhat differently. In silicon, the only key transition
whose energy and symmetry assignment are beyond question is the indirect
band gap, Δ₁ᵐ-Γ₂₅′, where the superscript m denotes conduction band
minimum. In order to bring out a number of important features of the band
structure of silicon, we will adjust E(PERT) to the experimental indirect band
gap, and to a series of assumed values for L₁-L₃′. We have obtained E(PERT)
solutions for several choices of L₁-L₃′. For each choice, a number of different
two-parameter adjustments were carried out. The average E(PERT) solutions
for three particularly instructive choices of L₁-L₃′ are listed in Table 5. The
individual fits rarely differed from their respective averages by more than
±0.10 eV. Note that the E(PERT) entries in Table 5 are given to the nearest
0.05 eV.
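
One way to picture a two-parameter adjustment of this kind is as a small
linear system, assuming that each transition energy responds linearly to the
two adjustable parameters (a first-order perturbation picture). The Python
sketch below is illustrative only: the function name, the sensitivity
coefficients, and the starting energies are hypothetical placeholders and are
not taken from Table 5 or from any actual band calculation.

```python
def two_parameter_adjust(start, sens, targets):
    """Find parameter changes (p, q) that bring two chosen transitions to
    their target energies, assuming a linear response
        E_i = start[i] + sens[i][0] * p + sens[i][1] * q   (all in eV).
    `targets` must contain exactly two transition names."""
    (n1, t1), (n2, t2) = targets.items()
    a, b = sens[n1]
    c, d = sens[n2]
    r1, r2 = t1 - start[n1], t2 - start[n2]
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("the two target transitions do not fix both parameters")
    # Cramer's rule for the 2x2 system
    p = (r1 * d - b * r2) / det
    q = (a * r2 - r1 * c) / det
    adjusted = {k: start[k] + sens[k][0] * p + sens[k][1] * q for k in start}
    return (p, q), adjusted


# Hypothetical placeholder numbers (NOT from Table 5).
TOY_START = {"gap": 1.00, "L1-L3'": 2.90, "G15-G25'": 3.00}
TOY_SENS = {"gap": (1.0, 0.5), "L1-L3'": (0.5, 1.0), "G15-G25'": (0.1, -0.1)}

params, energies = two_parameter_adjust(
    TOY_START, TOY_SENS, {"gap": 1.13, "L1-L3'": 3.0}
)
for name, value in energies.items():
    print(f"{name:9s} {value:.3f} eV")
```

In this toy setting the two fitted transitions land exactly on their targets,
while a transition with small sensitivity coefficients barely moves, which is
the sense in which such a transition is called insensitive.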

TABLE 5. Comparison of Theoretical and Experimental Transition Energies for Silicon

It is obvious from Table 5 that insensitive transitions such as L₃-L₃′,
X₁-X₄, and Γ₁₅-Γ₂₅′ are pinned down quite nicely by our adjustment
procedure. Even if we could not specify the exact value of L₁-L₃′, we would
still have a fairly good idea of the magnitudes of the insensitive transition
energies. For example, if we make the rather weak assumption that L₁-L₃′
lies somewhere between 2.6 and 3.8 eV, we find that L₃-L₃′ = 5.0 ± 0.15 eV;
X₁-X₄ = 4.05 ± 0.15 eV; and most important of all, Γ₁₅-Γ₂₅′ = 2.75 ± 0.15
eV. (These results can be obtained from Table 5 by tripling the excursion of
L₁-L₃′, and extrapolating the other transitions.) In short, there is no way of
bringing Γ₁₅-Γ₂₅′ close to 3.4 eV within the framework of any one of several
two-parameter adjustments.
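
The size of the mismatch can be checked with simple arithmetic. The sketch
below uses only the numbers quoted in the preceding paragraph and assumes
that the insensitive transition varies linearly with the assumed L1-L3'
value over the whole range; the variable and function names are ours, chosen
for illustration.

```python
# Numbers quoted in the text (eV).
L_CENTER, L_HALF_RANGE = 3.2, 0.6        # L1-L3' assumed between 2.6 and 3.8
G15_CENTER, G15_HALF_WIDTH = 2.75, 0.15  # resulting G15-G25' = 2.75 +/- 0.15
PEAK = 3.4                               # reflectivity peak energy


def implied_slope(half_width, half_range):
    """Magnitude of dE/d(L1-L3') under the linear-response assumption."""
    return half_width / half_range


def required_excursion(target, center, slope):
    """Distance L1-L3' would have to move from its central value for the
    insensitive transition to reach `target`."""
    return abs(target - center) / slope


slope = implied_slope(G15_HALF_WIDTH, L_HALF_RANGE)
excursion = required_excursion(PEAK, G15_CENTER, slope)
max_g15 = G15_CENTER + G15_HALF_WIDTH

print(f"implied slope           : {slope:.2f}")
print(f"largest G15-G25' in range: {max_g15:.2f} eV")
print(f"L1-L3' shift needed for 3.4 eV: {excursion:.1f} eV from {L_CENTER} eV")
```

Under this linear assumption, L1-L3' would have to move several times
farther than the already generous 2.6 to 3.8 eV range before the insensitive
transition approached 3.4 eV.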

Our earlier study of adjustment procedures (cf. Section IV,C) suggests that
we can place considerable confidence in our predicted value for the insensitive
transition Γ₁₅-Γ₂₅′, but that the value of the sensitive transition Γ₂′-Γ₂₅′
is best determined by comparison of theory with experiment. In subsequent
sections, we will examine a variety of experimental evidence, partly to check
our theoretical predictions concerning the insensitive transitions, and partly
to discover clues concerning the sensitive ones. In the remainder of the present
section, we will compare our theoretical results with those of Cohen and
Bergstresser (1966). We hope to compare our results with those of Cardona
and Pollak (1966) in a separate publication.

It is noteworthy that our empirical solution starts from a first-principles
energy band model, E(NRSC), which is already in good qualitative agreement
with experiment (cf. Table 5). In contrast, the empirical pseudopotential
solution of Cohen and Bergstresser starts from the empty-lattice or
free-electron gas model, which must be grossly distorted before it can
approximate the experimental energy level scheme of a nonmetallic crystal such
as silicon. This distortion is illustrated (for germanium) in an earlier paper
by one of the present authors (Herman, 1958). In practice, we can bring key
features of E(NRSC) into agreement with E*(EXPT) by using empirical
energy level shifts ΔE(ΔV) which are an order of magnitude smaller than their
pseudopotential counterparts, as can be seen from Table 6. Therefore, in
the E(PERT) scheme, the uncertainties and ambiguities associated with the
empirical adjustment are an order of magnitude smaller than they are in the
E(PSEUDO) scheme. When one is attempting to fit theory to a limited amount
of incisive experimental information, as is the case in silicon, our approach
has obvious advantages over the pseudopotential approach.

Our E(PERT) solution for silicon (L₁-L₃′ = 3.0 eV) is compared with
Cohen and Bergstresser’s E(PSEUDO) solution in Table 5, and also in
Fig. 4. The two solutions are clearly quite different in the vicinity of the
conduction band states Γ₁₅ and Γ₂′. Part of the reason for the difference is
that Cohen and Bergstresser adjusted E(PSEUDO) to a Γ₁₅-Γ₂₅′ transition
energy of 3.4 eV, in accordance with Phillips’s earlier “categorical”
classification of principal energy levels in silicon and related semiconductors
(Phillips, 1962).

In the following, we will consider the experimental evidence for ourselves,
and explain why we believe there is no connection between the 3.4-eV peak
and Γ₁₅-Γ₂₅′ or related transitions. We believe the correct value for Γ₁₅-Γ₂₅′
is about 2.8 eV, the value given by our E(PERT) solution.
TABLE 6

Comparison of Empirical Energy Level Shifts ΔE(ΔV) for Key
Transitions in Silicon, Germanium, and Grey Tin ᵃ

                    Silicon ᵇ               Germanium ᵈ              Grey tin ᵉ
Transition    Present   Pseudo-        Present   Pseudo-        Present   Pseudo-
              work      potential ᶜ    work      potential ᶜ    work      potential ᶜ

Γ₁-Γ₂₅′       -0.4      8.6            -0.2      7.3            -0.4      5.7
Γ₂′-Γ₂₅′      0.55      3.35           0.7       0.9            -0.46     -0.2
Γ₁₅-Γ₂₅′      -0.25     2.8            -0.3      2.7            -0.3      2.2
X₁-X₄         -0.2      4.0            -0.1      4.1            -0.3      3.4
L₂′-L₃′       -0.4      9.5            -0.4      9.0            -0.3      7.3
L₃-L₃′        -0.25     5.05           -0.3      5.3            -0.3      4.4
L₁-L₃′        0.1       3.0            0.2       2.1            -0.3      1.4

ᵃ In the present work, ΔE(ΔV) is defined as E(PERT)-E(NRSC), where E(PERT) is
intended to be a close approximation to E*(EXPT). Since all transition energies shown above
are zero in the free-electron gas or empty-lattice model (cf. Herman, 1958), and since
E(PERT) ≈ E*(EXPT), the analogues of ΔE(ΔV) in the empirical pseudopotential method
of Brust (1964) and of Cohen and Bergstresser (1966) can be represented by the
corresponding values of E(PERT). In our approach, ΔE(ΔV) can be determined in most
circumstances by first-order perturbation theory. In the Brust-Cohen-Bergstresser approach,
the major portion of ΔE(ΔV) must be determined by solving high-order secular equations;
the remaining portion can then be determined by perturbation theory. All entries are in
electron volts.

ᵇ Based on Table 5, for L₁-L₃′ = 3.0 eV.

ᶜ Here ΔE(ΔV) = E*(EXPT) - (free-electron gas transition energy) = E*(EXPT) - 0 ≈
E(PERT). Note that in the free-electron gas model for diamond-type crystals, the valence
band level Γ₂₅′ and the conduction band levels Γ₁₅, Γ₂′, and Γ₁ are all degenerate;
similarly for X₄ and X₁; and similarly for L₃′, L₁, L₃, and the higher-lying conduction band
level L₂′.

ᵈ Based on Table 1. Still smaller ΔE(ΔV) can be obtained by our method: cf. Table 3.
(This remark also applies to other crystals.)

ᵉ Based on Table 1.
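
The order-of-magnitude comparison drawn from Table 6 can be made explicit by
taking the ratio of the two empirical shifts for each transition. The sketch
below does this for the silicon columns only; the numerical entries are those
of Table 6, while the ASCII transition labels (G for Γ, a trailing apostrophe
for the prime) and the function name are our own shorthand and reflect our
reading of the row labels.

```python
# Silicon columns of Table 6 (eV): (present work, pseudopotential).
SILICON_SHIFTS = {
    "G1-G25'":  (-0.4, 8.6),
    "G2'-G25'": (0.55, 3.35),
    "G15-G25'": (-0.25, 2.8),
    "X1-X4":    (-0.2, 4.0),
    "L2'-L3'":  (-0.4, 9.5),
    "L3-L3'":   (-0.25, 5.05),
    "L1-L3'":   (0.1, 3.0),
}


def shift_ratios(table):
    """Ratio |pseudopotential shift| / |present-work shift| per transition."""
    return {
        name: abs(pseudo) / abs(present)
        for name, (present, pseudo) in table.items()
    }


ratios = shift_ratios(SILICON_SHIFTS)
for name, ratio in ratios.items():
    print(f"{name:9s} ratio = {ratio:5.1f}")
```

For silicon every ratio comes out well above unity, and most are of order
ten or more; the sensitive Γ₂′-Γ₂₅′ transition gives the smallest ratio.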

C. Piezoreflectance studies 

Using plane polarized light, Gerhardt (1965a,b) has measured the strain- 
induced changes in the reflectivity peak of silicon near 3.4 eV produced by 
uniaxial deformation along the [111] and [100] axes. For compression (or 
extension) along the [100] axis, the sign of the energy shift of the peak is 
different for light polarized parallel and perpendicular to the [100] axis, 
whereas compression along the [111] axis always shifts the peak to higher 
energies. On the basis of qualitative arguments, Gerhardt comes to the
conclusion that transitions clustered along the [100] axes rather than along the
[111] axes are the main contributors to the 3.4-eV reflectivity peak. 
Fig. 4. Comparison of three energy band models for silicon (spin-orbit splitting
neglected). One of our two-parameter E(PERT) models is represented by the solid lines, while
another is represented by the dash-dot lines. Both of these models have been adjusted to the
experimental indirect band gap, Δ₁ᵐ-Γ₂₅′ = 1.13 eV, but the former has been adjusted to the
assumed value L₁-L₃′ = 3.0 eV, while the latter has been adjusted to the assumed value
L₁-L₃′ = 3.4 eV. In the interest of clarity, the dash-dot lines are not shown where they nearly
coincide with the solid lines. The light dashed lines represent the three-parameter
pseudopotential model of Cohen and Bergstresser (1966). The lower and upper shaded regions
correspond to the initial and final state energy ranges associated with the 3.4-eV reflectivity
peak, as given by the photoemission studies of Spicer and Simon (1962); see also Table 7.

This conclusion is compatible with the results of earlier pseudopotential 
band calculations by Brust (1964), and with the results of the more recent 
pseudopotential calculations by Cohen and Bergstresser (1966), but not 
with our own results. Since the interpretation of Gerhardt’s measurements is 
open to question,* let us now turn to deformation potential measurements, 
which are considerably easier to interpret theoretically. 

* In principle, Gerhardt’s measurements can be interpreted in terms of a model which
specifies (a) the geometrical shape and location in the reduced zone of the region primarily
responsible for the 3.4-eV reflectivity peak; (b) the changes in shape and location of this
region produced by uniaxial deformation; (c) the changes in the polarization-dependent
interband matrix elements produced by uniaxial deformation. Although Gerhardt postulates
a model for the 3.4-eV reflectivity peak and proceeds to interpret his data in terms of this
model, it is not clear that his model is unique. His conclusions are not based on a set of
selection rules which can be justified theoretically, but rather on a set of assumed selection
rules. Unless his selection rules and other features of his model can be validated by further
theoretical analysis, Gerhardt’s conclusions must be regarded as tentative. The situation
here may be analogous to the interpretation of magnetoresistance data for p-type germanium
on the basis of a many-valley valence band model. An analysis of such experimental data in
terms of this model would point to [111] valence band maxima, but this is not the correct
model. In order to interpret the data properly, it is necessary to begin with warped or fluted
constant energy surfaces, rather than with disconnected clusters of ellipsoidal constant
energy surfaces.

D. Deformation potential studies

Zallen’s experimental deformation potential value for the 3.4-eV
reflectivity peak in silicon is -5.0 ± 0.5 eV (cf. Table 4). Referring to our
theoretical deformation potential values in Table 4, and to our theoretical
transition energies in Table 5, we see that at the midpoint of the [100] axis,
where Δ₁-Δ₅ is 3.3 eV, the corresponding net deformation potential difference,
or deformation potential for short, is -1.3 eV. If we move inward along the
[100] axis from the midpoint to the zone center, Δ₁-Δ₅ decreases from 3.3 to
2.8 eV, while the deformation potential changes from -1.3 to -0.6 eV. If we
reverse direction, and move outward along the [100] axis from the midpoint
to the zone face, Δ₁-Δ₅ increases from 3.3 to 4.0 eV, while the deformation
potential changes from -1.3 to -1.4 eV.

If we assume that the 3.4-eV reflectivity peak is due primarily to Δ₁-Δ₅
transitions, as Gerhardt suggests, we run into two difficulties: First, our Δ₁
and Δ₅ bands are not parallel in the region where their energy separation is
about 3.4 eV, so that Δ₁-Δ₅ transitions from this region would not make a
large contribution to the joint interband density of states in the neighborhood
of 3.4 eV. Second, our theoretical deformation potential value for this region
is different by a factor of 4 from Zallen’s experimental value.

The situation for the 3.4-eV reflectivity peak in silicon may well be similar
to that for the 4.5-eV reflectivity peak in silicon (and also in germanium).
As Kane (1966) has shown theoretically, the 4.5-eV reflectivity peak in silicon
is produced by transitions associated with an extended region of the reduced
zone, rather than by transitions located exclusively in the neighborhood of the
X point. This accounts, for example, for the difference between the
experimental deformation potential value for the 4.5-eV silicon peak,
D(EXPT) = -2.8 ± 0.5 eV, and our theoretical value for the X₁-X₄ transition,
D(NRSC) = -1.4 eV (cf. Table 4).

It is quite possible, and in fact likely, that the 3.4-eV reflectivity peak in
silicon is also due to transitions associated with an extended region of the
reduced zone, though this region would be less extended than that associated
with the 4.5-eV reflectivity peak, which is considerably stronger. If this
conjecture is correct, the extended region for the 3.4-eV peak should coincide
with a region in the zone where the deformation potential is close to the
experimental value for this peak, -5.0 ± 0.5 eV.
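
The mismatch between theory and experiment along the [100] axis can be
tabulated directly. The sketch below uses only the transition energies and
deformation potentials quoted in this section for the zone center, the
midpoint, and the zone face; the function name and the simple "within the
error bar" criterion are our own illustrative choices.

```python
# (location on the [100] axis, transition energy in eV,
#  theoretical deformation potential in eV), as quoted in the text.
THEORY_100 = [
    ("zone center", 2.8, -0.6),
    ("midpoint",    3.3, -1.3),
    ("zone face",   4.0, -1.4),
]
D_EXPT, D_ERR = -5.0, 0.5  # experimental value for the 3.4-eV peak


def within_experiment(d_theory, d_expt=D_EXPT, err=D_ERR):
    """True if the theoretical value lies inside the experimental error bar."""
    return abs(d_theory - d_expt) <= err


for place, energy, d_theory in THEORY_100:
    ratio = D_EXPT / d_theory
    print(f"{place:11s} E = {energy:.1f} eV  D = {d_theory:+.1f} eV  "
          f"expt/theory = {ratio:.1f}  match = {within_experiment(d_theory)}")
```

None of the three points on the axis falls inside the experimental error
bar, and at the midpoint, where the transition energy is closest to 3.4 eV,
the experimental value is roughly four times the theoretical one.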
We have attempted to visualize the 3.4-eV region by studying the
deformation potentials and interband transition energies at the points listed
in Tables 4 and 5, and at a few other key points. So far as we can tell, the
region of interest includes a section near the [110] axis about […] of the way
from Γ to the zone boundary, as well as a cylindrical section which surrounds
the [111] axis most of the way from L to Γ. It is difficult to visualize the
precise location and form of this region without carrying out a detailed survey
of the reduced zone in the manner of Brust (1964, 1965). We are planning to
undertake such a survey in the near future, but we expect to be able to map out
the constant energy (and deformation potential) profiles in the reduced zone
more efficiently by utilizing recently developed visual display techniques (cf.
Wahl, 1966).

E. Photoemission studies 

Vital evidence concerning the absolute energies of the initial and final 
states of the interband transitions giving rise to the 3.4-eV reflectivity peak in 
silicon is provided by the photoemission studies of Spicer and Simon (1962). 
This evidence is obtained by studying the energy distribution of the
photoemitted electrons as a function of the energy of the exciting photons, and also
by comparing the spectral distribution of the photoelectron quantum yield 
with the optical reflectivity spectrum. 

Briefly, Spicer and Simon find that 3.4-eV incident photons produce a broad
peak in the energy distribution of the photoemitted electrons centered 1.0 ± 0.2
eV above the vacuum level, while 5.3-eV incident photons produce a broad
peak centered 2.3 ± 0.2 eV above the vacuum level. These two peaks can be
correlated with the 3.4- and 5.3-eV peaks which occur both in the optical
reflectivity spectrum and in the photoelectron quantum yield spectrum. The
experimental evidence is consistent with the view that the same two sets of
initial valence band states and final conduction band states play a dominant role
in determining the 3.4- and 5.3-eV reflectivity peaks on the one hand, and the
1.0 ± 0.2 and 2.3 ± 0.2 eV photoelectron peaks on the other. Since most of
the photoemission takes place from a region of the crystal in which the highest
valence band level lies 1.5 eV below the vacuum level, it can be inferred that
the two photoelectron peaks lie 2.5 ± 0.2 and 3.8 ± 0.2 eV above Γ₂₅′.
According to Spicer (private communication), there is good evidence that the
photoelectrons first make a vertical transition from valence band to
conduction band states, and then are emitted from the crystal without any
appreciable loss of energy. It follows that the two final state energies are
2.5 ± 0.2 and 3.8 ± 0.2 eV, while the corresponding initial state energies are
2.5 ± 0.2 - 3.4 = -0.9 ± 0.2 and 3.8 ± 0.2 - 5.3 = -1.5 ± 0.2 eV, all relative
to Γ₂₅′.
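
The bookkeeping in the preceding paragraph can be written out step by step.
The sketch below assumes, as stated in the text, a vertical transition
followed by emission without appreciable energy loss, and a valence band
maximum lying 1.5 eV below the vacuum level; the function and constant names
are ours. The ±0.2 eV experimental uncertainty carries through unchanged and
is not propagated in the code.

```python
VALENCE_TOP_BELOW_VACUUM = 1.5  # eV, highest valence band level below vacuum


def state_energies(photon_energy, peak_above_vacuum):
    """Return (initial, final) state energies in eV, measured from the top
    of the valence band, for a photoelectron peak observed at
    `peak_above_vacuum` eV above the vacuum level."""
    final = peak_above_vacuum + VALENCE_TOP_BELOW_VACUUM
    initial = final - photon_energy
    return initial, final


results = {}
for photon, peak in [(3.4, 1.0), (5.3, 2.3)]:
    initial, final = state_energies(photon, peak)
    results[photon] = (initial, final)
    print(f"{photon:.1f}-eV photons: final = {final:.1f} eV, "
          f"initial = {initial:.1f} eV")
```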

In Table 7, Spicer and Simon’s initial and final state energies for the two
sets of transitions are compared with relevant features of our energy level
scheme (cf. also Fig. 4). The higher energy peak can be identified with
transitions closely related to the L₃-L₃′ transition. Since the (so-called)
5.3-eV optical reflectivity peak is quite broad, it is likely that an extended
region near the L point is associated with this peak; this is indeed indicated
by Kane’s (1966) calculations. There may also be a significant contribution to
the 5.3-eV peak from Δ₂′-Δ₅ transitions near the [100] axis midpoint. The
breadth of this peak may also be related to the presence of these additional
transitions.

TABLE 7

Spicer and Simon’s Photoemission Results for Silicon ᵃ

         Energy level scheme for silicon                 Spicer and Simon ᶜ (experiment)
Level    Initial state   Final state   E(PERT) ᵇ         3.4-eV peak     5.3-eV peak

L₃                       Near L₃       3.85 ± 0.05                       3.8 ± 0.2
Γ₂′                                    3.8 ± 0.4
Γ₁₅                      Extended      2.75 ± 0.05       2.5 ± 0.2
L₁                                     2.05 ± 0.2
Γ₂₅′     Extended                      0.0               -0.9 ± 0.2
L₃′      Near L₃′                      -1.15                             -1.5 ± 0.2 ᵈ

ᵃ All entries are in electron volts.

ᵇ Based on L₁-L₃′ = 3.2 ± 0.2 eV. See Table 5 for further details.

ᶜ From Spicer and Simon (1962).

ᵈ If the interband transition energy is taken as 5.1 rather than 5.3 eV, this entry would
read: -1.3 ± 0.2 (eV).

More important for our immediate purpose, however, is the identification
of the lower energy peak. With the initial state energy pinned down to the
range -0.9 ± 0.2 eV, transitions at or very near the center of the zone are
ruled out as principal contributors to the 3.4-eV reflectivity peak. This range
could probably be stretched out slightly, say to -0.9 ± 0.3 eV, but this
would still exclude Γ and its neighborhood. Therefore, the association of
Γ₁₅-Γ₂₅′ or closely related transitions with the 3.4-eV reflectivity peak
appears to be ruled out not only by the deformation potential evidence, but
also by the photoemission evidence. The region in the reduced zone compatible
with the initial state energy range -0.9 ± 0.2 (or 0.3) eV includes the [110]
section and the cylindrical section surrounding the [111] axis already suggested
by the deformation potential evidence.
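
The exclusion of the zone center amounts to a simple interval test. In the
sketch below, the valence band maximum at the zone center is the zero of
energy, and the initial state energy window is the one quoted above; the
function name is a hypothetical helper introduced only for illustration.

```python
def compatible(level, center, half_width):
    """True if an energy level (eV) lies inside center +/- half_width."""
    return abs(level - center) <= half_width


ZONE_CENTER_VALENCE_TOP = 0.0  # eV, the zero of energy
INITIAL_STATE_CENTER = -0.9    # eV, initial state for the 3.4-eV peak

checks = {
    half_width: compatible(ZONE_CENTER_VALENCE_TOP, INITIAL_STATE_CENTER, half_width)
    for half_width in (0.2, 0.3)
}
for half_width, ok in checks.items():
    print(f"window -0.9 +/- {half_width}: zone center allowed = {ok}")
```

Even with the wider ±0.3 eV window, the upper edge of the allowed range lies
0.6 eV below the valence band maximum at the zone center.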

For further discussions of photoemission studies of the band structure of 
silicon, the reader is referred to papers by Brust (1965), Cohen and Phillips 
(1965), and Allen and Gobeli (1966). 
Having stated our reasons for believing (a) that the 3.4-eV reflectivity peak
is not associated with Γ₁₅-Γ₂₅′ or closely related transitions, and (b) that this
peak is associated with interband transitions belonging to an extended region
in the reduced zone, most likely a region encircling the [111] axis, and even
extending as far as the [110] axis, we conclude our discussion of the 3.4-eV
reflectivity peak in silicon. In the next section, we will consider some
experimental evidence which may help us pin down one of the sensitive
transitions (Γ₂′-Γ₂₅′ or L₁-L₃′) and thereby eliminate the residual uncertainty
in our energy band model for silicon (cf. Table 5 and Fig. 4).

F. Electroreflectance studies

Before turning to the electroreflectance studies of Seraphin (1964, 1965), 
we wish to remind the reader that the standard reflectivity spectrum is 
closely related to the joint interband density of states, and that under ideal 
conditions characteristic features in the reflectivity spectrum can be related to 
critical points (Phillips, 1966). Shoulders or edges are identified with M₀ or
M₃ critical points (parabolic band edges), while peaks are identified with
pairs of M₁, M₂ critical points (saddle-point band edges). In practice,
however, a reflectivity peak may be produced by a set of interband transitions
whose high density is not due to a specific critical point, but rather to an 
extended region of the reduced zone. Such a reflectivity peak may or may not 
be accompanied by a corresponding electroreflectivity peak (ER-peak): 
The electroreflectivity spectrum is not so much a measure of the joint 
interband density of states as it is a measure of the change in this density 
produced by an electric field. Since electric-field induced changes in the joint 
density of states are singular at critical points, critical points show up more 
clearly in electroreflectivity spectra than they do in standard optical spectra. 
It is for this reason that electroreflectance studies, pioneered by Seraphin 
and now being vigorously pursued by the Brown University group under 
Cardona, are expected to play a prominent role in future band structure 
determinations. 

Using high-resolution modulated electroreflectivity techniques, Seraphin
(1965) has resolved three ER-peaks within a 0.1-eV range near the 3.4-eV
reflectivity peak in silicon. The height of the ER-peak at 3.34 eV is
independent of the crystalline orientation of the reflecting surface, while the
height of the ER-peak at 3.45 eV is markedly dependent on crystalline
orientation. On the basis of these orientation effects, he assigns the 3.34-eV
ER-peak to a critical-point transition at the central point of the zone, and
the 3.45-eV ER-peak to a critical-point transition elsewhere in the zone. He
has also studied the temperature dependence of the two ER-peaks already
mentioned, as well as that of the third ER-peak, which lies between the other
two at lower temperatures, but disappears at higher temperatures. In view of
its pronounced temperature dependence, this intermediate ER-peak has been
attributed to an exciton. When the average electric field in the surface is
changed by varying the dc bias, the 3.34- and 3.45-eV ER-peaks shift in
energy in opposite directions. According to the duality theorem (Phillips and
Seraphin, 1965), these opposite spectral shifts indicate that one ER-peak must
be assigned to a parabolic band edge, where the band separation is an
extremum, while the other ER-peak must be assigned to a saddle-point band edge,
where the gradient of the band separation is zero.

Combining the crystal orientation and spectral shift deductions, Seraphin 
concludes that the 3.34-eV ER-peak is associated with a critical-point 
transition at a parabolic band edge at the central point of the zone, and that 
the 3.45-eV ER-peak is associated with a critical-point transition at a
saddle-point band edge elsewhere in the zone, possibly along the [111] axis.

As Seraphin himself emphasizes, current interpretations of
electroreflectivity spectra should be treated with caution, since our
theoretical ability to analyze spectral information is still rather limited.
For example, Seraphin’s analysis of his spectral data does not show up the
spin-orbit splitting in the valence band, which is 0.044 eV at the center of
the zone, but smaller elsewhere. Also, it appears to be difficult to understand
the temperature dependence of the various ER-peaks. Therefore, the above
conclusions reached by Seraphin should be regarded as provisional.

If one or more of the ER-peaks near 3.4 eV is ultimately identified with a
critical-point transition at the center of the zone, we would expect this
transition to be Γ₂′-Γ₂₅′, rather than Γ₁₅-Γ₂₅′, since our two-parameter E(PERT)
model places Γ₁₅-Γ₂₅′ at about 2.8 eV whatever we choose for L₁-L₃′ within
reasonable limits (cf. Table 5). We have anticipated the possibility that
Γ₂′-Γ₂₅′ = 3.35 ± 0.05 eV by pegging one of our three E(PERT) models to
this value. This model, which corresponds to the choice L₁-L₃′ = 3.0 eV, is
displayed in Table 5 and in Fig. 4 as well. We have also anticipated the
possibility that one or more of the ER-peaks near 3.4 eV will ultimately be
identified with a critical-point transition somewhere along (or near) the [111]
axis by using L₁-L₃′ = 3.4 eV as the basis for another E(PERT) model: this is
also displayed in Table 5 and Fig. 4.

However, we can definitely rule out the possibility that Γ₂′-Γ₂₅′ and
L₁-L₃′ are 3.4 eV simultaneously: Setting up a three-parameter E(PERT)
model based on Δv(111), Δv(220), and Δv(311), and adjusting these three
parameters to Δ₁ᵐ-Γ₂₅′ = 1.13 eV, Γ₂′-Γ₂₅′ = 3.4 eV, and L₁-L₃′ = 3.4 eV, we
obtain totally unacceptable values for the other key transitions, e.g.,
Γ₁₅-Γ₂₅′ = 0.06 eV; X₁-X₄ = 1.3 eV; and L₃-L₃′ = 2.8 eV.

It is unfortunate that no definite conclusions concerning the proper
assignments of the 3.34- and 3.45-eV ER-peaks can be drawn at this time. There is
good reason, however, to believe that future theoretical analyses of Seraphin’s
measurements will lead to definite assignments for these ER-peaks, and,
hopefully, to definite information concerning one or another of the sensitive
transitions.

Recent theoretical work by Aspnes (reported by P. Handler, invited paper,
Durham Meeting of the American Physical Society, March 31, 1966) indicates
that the duality theorem (Phillips and Seraphin, 1965) has only limited
applicability, and does not provide a clear-cut basis for interpreting
electroreflectivity spectra in general. The transition energy in the
neighborhood of a critical point can be expanded in a Taylor series (E vs. k)
provided the initial and final states are nondegenerate. The three reduced
masses of an M₀ critical point (in a principal axis coordinate system) are all
positive; one or two of these three masses is (are) negative for M₁ or M₂
critical points; and all three of these masses are negative for an M₃ critical
point. According to the work of Aspnes, the electroreflectivity spectrum
associated with a saddle-point band edge (M₁ or M₂) can be decomposed into
longitudinal and transverse components, corresponding to the electric field
being oriented parallel and perpendicular to the odd sign reduced mass. The
duality theorem of Phillips and Seraphin relates the spectral shift at a
parabolic band edge to the longitudinal but not to the transverse spectral
shift at a saddle-point band edge. This omission, which has been remedied by
Aspnes, invalidates the arguments that Phillips and Seraphin have used in
interpreting Seraphin’s spectral shifts. By obtaining a more general solution
for the interrelationships among ER-peaks associated with different types of
critical points, Aspnes (1966) has paved the way for future theoretical
interpretations of measurements such as Seraphin’s.
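
The classification of critical points by the signs of the reduced masses can
be summarized in a few lines. The sketch below follows the usual convention
that the index of the critical point equals the number of negative
principal-axis reduced masses (one for M1, two for M2); the function names
are ours, and the labels "parabolic" and "saddle-point" follow the usage of
this section.

```python
def classify_critical_point(reduced_masses):
    """Return 'M0', 'M1', 'M2', or 'M3' from the signs of the three
    principal-axis reduced masses of an interband critical point."""
    if len(reduced_masses) != 3 or any(m == 0 for m in reduced_masses):
        raise ValueError("expected three nonzero reduced masses")
    negatives = sum(1 for m in reduced_masses if m < 0)
    return f"M{negatives}"


def band_edge_type(label):
    """Parabolic band edges are M0 and M3; saddle-point edges are M1 and M2."""
    return "parabolic" if label in ("M0", "M3") else "saddle-point"


for masses in [(1.0, 1.0, 1.0), (1.0, -1.0, 1.0), (-1.0, -1.0, 0.5), (-1.0, -1.0, -1.0)]:
    label = classify_critical_point(masses)
    print(masses, label, band_edge_type(label))
```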


VI. Germanium-Silicon Alloys 

Considerable insight into the nature of the energy band structure of pure 
germanium and pure silicon can be gained by a careful study of the composi¬ 
tion dependence of the optical reflectivity spectrum of the germanium-silicon 
alloy system. According to the measurements of Tauc and Abraham (1961), 
the 2.2-eV reflectivity peak in germanium—this is actually a 2.1, 2.3 eV 
doublet—shifts to higher energy as silicon is added to germanium. In the 
range from 0 to 35 at. % silicon, the doublet splitting decreases linearly from 
0.2 to 0.1 eV. The doublet cannot be resolved above 35 at. % Si, but the unre¬ 
solved peak can be followed experimentally up to about 55 at. % Si, where it 
lies at about 2.9 eV. In the range between 75 and 100 at.% Si, there is 
another reflectivity peak which shifts from 3.25 to 3.4 eV in this range. If one 
extrapolates the composition dependence of the energy of the (2.2 eV) 
germanium peak, this line passes through the 3.25-eV experimental point at 
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Fig. 5. Composition dependence of the energy of the first reflectivity peak in the 
germanium-silicon alloy system, according to Tauc and Abraham (1961). In the 
germanium-rich alloys, this peak is actually a spin-orbit split doublet. 


75 at. % Si. However, it seems more reasonable to associate the 3.25-eV point 
with the 3.4-eV silicon peak than with the 2.2-eV germanium peak (cf. Fig. 5). 
In any event, there is a sharp break in the curve giving the composition dependence
of the energy of the first reflectivity peak at 75 at. % Si, suggesting that
the 3.4-eV silicon peak is unrelated to the 2.2-eV germanium peak. This is
reminiscent of the sharp break in the composition dependence of the fundamental
absorption edge in the germanium-silicon alloy system at 15 at. % Si,
which one of us explained some time ago (Herman, 1954). 
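
The composition dependence quoted above from Tauc and Abraham (1961) can be put in numerical form. The following is a minimal sketch, assuming that each branch is a straight line between the quoted end points (for the silicon-rich branch this linearity is an assumption, not a statement of the source); the function names are illustrative only.

```python
def lerp(x, x0, y0, x1, y1):
    """Straight line through (x0, y0) and (x1, y1), evaluated at x."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def doublet_splitting_ev(si_at_pct):
    """Doublet splitting, quoted as falling linearly from 0.2 to 0.1 eV
    between 0 and 35 at. % Si; the doublet is unresolved above that."""
    if not 0.0 <= si_at_pct <= 35.0:
        raise ValueError("doublet is resolved only from 0 to 35 at. % Si")
    return lerp(si_at_pct, 0.0, 0.2, 35.0, 0.1)


def ge_branch_peak_ev(si_at_pct):
    """Germanium-like branch: the 2.2-eV peak, extrapolated along the line
    that passes through 3.25 eV at 75 at. % Si."""
    return lerp(si_at_pct, 0.0, 2.2, 75.0, 3.25)


def si_branch_peak_ev(si_at_pct):
    """Silicon-like branch: 3.25 eV at 75 at. % Si to 3.4 eV at 100 at. % Si
    (assumed linear between these two quoted points)."""
    return lerp(si_at_pct, 75.0, 3.25, 100.0, 3.4)


ge_slope = (3.25 - 2.2) / 75.0   # eV per at. % Si below the break
si_slope = (3.4 - 3.25) / 25.0   # eV per at. % Si above the break
print(f"slope below the break: {ge_slope:.4f} eV per at. % Si")
print(f"slope above the break: {si_slope:.4f} eV per at. % Si")
```

Under these assumptions the slope drops by more than a factor of two at 75 at. % Si, which is the sharp break referred to in the text.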

We would now like to offer an interpretation of the sharp break in the Tauc 
and Abraham curve in terms of our energy band models for germanium and 
silicon, which are shown superimposed in Fig. 6. It will be noted that certain 
energy band profiles appear virtually identical in both crystals, while others 
shift drastically between germanium and silicon. In particular, the
L_3-Λ_3-Γ_15-Δ_1-X_1-Σ_1 conduction band profile remains nearly stationary,
while the L_1-Λ_1-Γ_2'-Δ_2'-X_1-Σ_3 conduction band profile shows the greatest
change.

There is an obvious correlation between the magnitude of the deformation 
potential and the magnitude of the chemical shift (alloy effect) for the various 
interband transitions, but we will not pause to discuss this correlation further. 

As is known from Brust’s detailed studies (Brust, 1964), the 2.2-eV reflectivity
peak in germanium is associated with a Λ_1-Λ_3 critical-point transition
in the neighborhood of the Λ_1 arch (see also Cardona and Pollak, 1966).
As silicon is added to germanium, the 2.2-eV peak obviously shifts to higher
energies because of the upward motion of Λ_1, and the doublet splitting
Fig. 6. Energy band structure of the germanium-silicon alloy system, obtained by
superimposing the present energy band models for germanium (cf. Fig. 1) and silicon
(cf. Fig. 4). As silicon is added to germanium, the germanium energy band profiles (shown
dashed) sweep through the shaded regions, and approach the silicon energy band profiles.
The latter are shown by solid lines, corresponding to the assumption that L_1-L_3' = 3.0 eV,
and by dash-dot lines, corresponding to the alternate assumption that L_1-L_3' = 3.4 eV. In
the interest of clarity, the dash-dot lines are not shown where they nearly coincide with the
solid lines.

obviously decreases because the spin-orbit splitting in the Λ_3 valence band
decreases. As the L_1-Λ_1-Γ_2' conduction band profile rises in energy, a
composition is reached at which the Γ_2' level crosses the nearly stationary
Γ_15 level. The lower Λ_1 conduction band, formerly attached to Γ_2', now
becomes attached to Γ_15 and rises no higher, while the upper conduction
band, formerly attached to Γ_15, now becomes attached to Γ_2' and begins to
rise with Γ_2'.
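
The cross-over and re-attachment just described can be illustrated with a small numerical sketch. It assumes that the Γ_2' level rises linearly with silicon content while Γ_15 stays fixed; the energies used in the demonstration are hypothetical values chosen only to place the cross-over near the composition discussed in the text, and the function names are illustrative.

```python
def gamma2p_energy(x, e_ge, e_si):
    """Gamma_2' energy at Si fraction x (0..1), assumed linear in x."""
    return e_ge + (e_si - e_ge) * x


def crossover_composition(e_g2p_ge, e_g2p_si, e_g15):
    """Si fraction at which the rising Gamma_2' level meets the stationary
    Gamma_15 level; returns None if they do not meet between 0 and 1."""
    rise = e_g2p_si - e_g2p_ge
    if rise <= 0.0:
        raise ValueError("Gamma_2' is assumed to rise with silicon content")
    x = (e_g15 - e_g2p_ge) / rise
    return x if 0.0 <= x <= 1.0 else None


def zone_center_anchors(x, e_g2p_ge, e_g2p_si, e_g15):
    """Zone-center energies to which the lower and upper Lambda_1 conduction
    bands are attached at Si fraction x."""
    e_g2p = gamma2p_energy(x, e_g2p_ge, e_g2p_si)
    return min(e_g2p, e_g15), max(e_g2p, e_g15)


# Hypothetical energies (eV above the valence band edge), for illustration only.
E_G2P_GE, E_G2P_SI, E_G15 = 0.9, 3.4, 2.8
print("cross-over at Si fraction",
      crossover_composition(E_G2P_GE, E_G2P_SI, E_G15))
for x in (0.0, 0.5, 0.9, 1.0):
    print(x, zone_center_anchors(x, E_G2P_GE, E_G2P_SI, E_G15))
```

Below the cross-over the lower anchor moves with Γ_2'; above it the lower anchor is pinned at Γ_15 and only the upper anchor continues to rise.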

The sharp break in the Tauc and Abraham curve at 75 at. % silicon is undoubtedly
related to the changes in the conduction band structure away from
the center of the zone induced by the cross-over of Γ_2' and Γ_15 at a composition
not far removed from 75 at. % silicon. Below this critical composition,
the spectral shift of the first reflectivity peak is large because the upward
motion of the Λ_1 conduction band is unimpeded by the anchoring of this band
to Γ_15 at the zone center. Above this critical composition, the spectral shift of
this peak is small because the upward motion of the Λ_1 conduction band is
arrested by the anchoring at the zone center. It would not be too surprising if
this anchoring produced a significant change in the shape and location of the
region in the reduced zone which contributes most heavily to the first reflectivity
peak. While the region associated with the first reflectivity peak
may well have a different shape and location in pure germanium and pure 
silicon, we would expect this region to change continuously, if not uniformly, 
as we pass through the critical composition in the neighborhood of 75 at. % 
silicon. 

We have recently learned (M. Cardona and F. H. Pollak, private communication)
that electroreflectivity studies of the germanium-silicon alloy system
are now in progress. It will be interesting to see whether the cross-over of 
Γ_2' and Γ_15 can be observed. It will also be interesting to see whether the
Λ_1-Λ_3 critical-point transition, which shows up so clearly in the electroreflectivity
spectrum of pure germanium (Seraphin, 1964), can be traced
through the entire composition range from pure germanium to pure silicon, 
and how the composition dependence of this transition compares with that of 
the first reflectivity peak. 

Many of the questions raised by the present paper would be resolved if the
Γ_15-Γ_25' transition could be detected (and unambiguously identified) in pure
germanium, in pure silicon, or in any of the intermediate germanium-silicon
alloys. At present, it is not clear whether Γ_15-Γ_25' (like Γ_2'-Γ_25' in germanium)
can be detected by electroreflectivity measurements, or whether
Γ_15-Γ_25' (like L_1-L_3' in germanium) cannot be detected by such measurements
(Seraphin, 1964, 1965). (See Note added in proof.)


VII. Concluding Remarks 

We have developed a method for determining the energy band structure of 
crystals which is designed to be more reliable than purely first-principles or 
purely empirical (pseudopotential or k • p) methods. Superior accuracy is 
attained by adding a small, carefully chosen empirical correction to an otherwise
first-principles band calculation. By carefully examining the nature of
empirical corrections, we have learned how to construct optimally flexible 
adjustment schemes based on a minimum number of adjustable parameters, 
and how otherwise to enhance the effectiveness and reliability of our small 
empirical correction (cf. Table 2). Relativistic and spin-orbit coupling corrections
can be taken into account by a first-principles calculation, by an empirical
correction, or by a combination of the two. In practice, our overall energy 
band solution is based almost entirely on first-principles considerations, and 
only slightly on empirical considerations. 

At present, our first-principles band calculations are based on Slater’s free- 
electron exchange approximation, though we have experimented with other 
exchange approximations (cf. Table 3). Although we are attempting to improve 
our treatment of exchange effects by going beyond the free-electron exchange
approximation and variations thereof, the ultimate accuracy of our method is 
not critically dependent on the exact choice of exchange approximation: in 
practice, shortcomings in our treatment of exchange (and correlation) effects 
are largely compensated by the empirical correction. 

Our motivation for developing an empirically perturbed self-consistent field 
method may be summarized as follows. First, we recognized that it was 
impractical to treat exchange, correlation, and relativistic effects rigorously, 
and that some form of empirical correction was necessary if the difficulties 
associated with treating these effects rigorously were to be circumvented. 
Second, we recognized that purely empirical methods were not nearly as 
reliable as some of their advocates would have us believe (cf. Phillips, 1966), 
and that it would be far better to add a small empirical correction to a physically
reliable first-principles band calculation than to depend entirely on the
caprices of an empirical adjustment (cf. Table 6). Third, we recognized that
the small empirical correction required to bring our first-principles theory into
agreement with experiment was in effect a quantitative measure of the shortcomings
of our first-principles band calculations, and that we might be able to
eliminate at least some of these shortcomings if we could study them systematically,
and on a quantitative basis. In fact, we have already discovered a
number of ways for improving our first-principles band calculations and concurrently
reducing the magnitude of the empirical correction required to bring
theory and experiment into agreement. Some of these ways have already been
incorporated in the present work, while others are still being explored with a
view to future incorporation. (See also Herman et al., 1966.)

Instead of attempting to fit our theoretical energy band models to a mixture 
of precisely and imprecisely known experimental information, as is the 
common practice in purely empirical band calculations (Brust, 1964; Cohen 
and Bergstresser, 1966; Cardona and Pollak, 1966), we have adjusted our
band models only to the most firmly established experimental information, 
namely, the direct and indirect band gaps in grey tin and germanium, and the 
indirect band gap in silicon. By using two-parameter adjustment schemes, we 
were able to obtain unique solutions for grey tin and germanium, and a one- 
parameter family of possible solutions for silicon. Many of the essential 
features of the silicon band structure, namely, the insensitive transitions, are 
actually pinned down quite nicely by this type of solution. In order to pin 
down the sensitive transitions, it is necessary to make a further appeal to 
experiment. Even though definite conclusions have not yet been reached 
concerning the interpretation of recent electroreflectivity measurements 
(Seraphin, 1965), we have anticipated two possible interpretations by paying 
special attention to two energy band models which correspond to these interpretations.
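
The counting argument behind the unique solutions for grey tin and germanium and the one-parameter family for silicon can be illustrated as follows. This is only a sketch under the assumption that each fitted gap responds linearly to small changes in the two adjustable parameters; the sensitivity coefficients are hypothetical and are not taken from the present calculations.

```python
def solve_two_parameter(sensitivity, required_shift):
    """Solve  a*p1 + b*p2 = s1,  c*p1 + d*p2 = s2  for (p1, p2).

    sensitivity    -- ((a, b), (c, d)): change of each gap per unit parameter
    required_shift -- (s1, s2): shift needed to bring each gap to experiment
    """
    (a, b), (c, d) = sensitivity
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("the two gaps do not fix the parameters independently")
    s1, s2 = required_shift
    return (s1 * d - b * s2) / det, (a * s2 - c * s1) / det


def one_parameter_family(sensitivity_row, required_shift, p2):
    """With a single fitted gap, p1 is fixed only once p2 has been chosen."""
    a, b = sensitivity_row
    return (required_shift - b * p2) / a, p2


# Hypothetical sensitivities (eV per unit parameter) and required shifts (eV).
SENS = ((1.0, 0.5), (0.25, 1.0))
print(solve_two_parameter(SENS, (0.2, 0.1)))   # two gaps: a unique solution
for p2 in (0.0, 0.1, 0.2):                     # one gap: a family of solutions
    print(one_parameter_family(SENS[0], 0.2, p2))
```

Two well-established gaps determine both parameters, whereas a single gap leaves one parameter free, which is why a further appeal to experiment is needed for silicon.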




In addition to obtaining new energy band solutions for silicon, germanium, 
and grey tin (cf. Tables 1 and 5, Figs. 1 and 4), we have calculated deformation 
potentials for most of the key interband transitions in silicon and germanium, 
and compared these calculated values with the measured values for supposedly 
related optical reflectivity peaks (cf. Table 4). Such comparisons show that the 
key X_1-X_4 transition is not really representative of the main reflectivity peak,
even though it is commonly identified as such (Phillips, 1962, 1964, 1966). 
This particular conclusion has also been reached by Kane (1966) on the basis 
of a careful analysis of the joint interband density of states. 

We have also determined the changes in the band structure of germanium 
produced by major as well as minor changes in the lattice constant (cf. Figs. 2 
and 3), and examined the composition dependence of the band structure of 
the germanium-silicon alloy system (cf. Fig. 6). 

By fitting our theoretical models only to well-established experimental information,
thereby freeing ourselves from previous speculations (Phillips, 1962,
1964, 1966) concerning the nature of the band structure away from the band
edges, we have been able to take a fresh look at the entire band structure,
including such poorly understood regions as the conduction band structure
associated with the triply-degenerate state Γ_15. Since Γ_15-Γ_25' is an insensitive
transition, we have been able to pin down this transition in silicon, as well as in
germanium and grey tin. In all three of these crystals, our estimates for the
Γ_15-Γ_25' transition energy are consistently lower than previous estimates (e.g.,
Cohen and Bergstresser, 1966; Cardona and Pollak, 1966) by at least 0.5 eV.
These revised estimates lead to significant changes in the band structure, as 
can be seen from Figs. 1 and 4. 

Since the disposition of three of the four lowest conduction bands in the central
region of the reduced zone is determined by Γ_15-Γ_25', the value chosen for
this transition has an important bearing on theoretical interpretations of ordinary
reflectivity spectra, electroreflectivity spectra, piezoreflectivity spectra, and
photoemission spectra, particularly in the range between 2 and 4 eV, where
Γ_15-Γ_25' and related transitions are most prominent. Our present work casts
doubt on theoretical interpretations of such spectra based on previous estimates
of Γ_15-Γ_25', and suggests that it would be desirable to reinterpret
these spectra in the light of our estimates for this transition.
Note Added in Proof

Ghosh (1966a) has recently remeasured the electroreflectivity spectrum of
germanium using the electrolytic technique. In addition to peaks at 2.12 and
2.32 eV which he attributes to the Λ_1-Λ_3 doublet, Ghosh also observes
structure at 2.05 and 2.24 eV which he attributes to the L_1-L_3' doublet. The
center of gravity of the 2.05, 2.24 eV doublet is 2.15 eV, which is in excellent
agreement with our estimate, 2.1 ± 0.1 eV. Ghosh also observes structure
between 2.6 and 2.9 eV which is not well resolved but otherwise quite definite.
From our point of view, this last observation is very encouraging, since 2.6
to 2.9 eV is just the spectral range where we believe the controversial Γ_15-Γ_25'
transition lies. Our best estimate for the center of gravity of the spin-orbit
split Γ_15-Γ_25' multiplet is 2.8 eV (cf. Section IV,C). It would be premature,
of course, to suggest that Ghosh has actually confirmed our prediction that
Γ_15-Γ_25' is about 2.8 eV. Ghosh has observed two peaks at 5.35 and 5.52
eV which may correspond to members of the spin-orbit split L_3-L_3'
multiplet, whose center of gravity we estimate to be about 5.4 eV (cf.
Table 1). Ghosh (1966b) has also remeasured the electroreflectivity spectrum
of silicon using the electrolytic technique, but his experimental curve falls just
short of the 2.7 to 2.8 eV region where we believe the silicon Γ_15-Γ_25' transition
lies.
I. Introduction

It was roughly thirty years ago that the essentials of energy band theory were
worked out. However, only in recent years have computer technology and the
preparation of single-crystal materials of controlled composition advanced to
the point that they can match the imagination of the band theorist. Indeed,
even though refinements in technique and investigations of self-consistency and
correlation effects are still receiving considerable attention, one can now
regard an energy band analysis as an incisive tool to be used in understanding
the electrical, optical, and magnetic properties of a material or family of
materials. Of all the various approaches suggested to the energy band problem,
only two seem to have survived. First, there is the augmented plane wave or
APW method devised by Slater (1937) and carefully investigated and applied
by many of his students. Secondly, there is the orthogonalized plane wave or
OPW method developed by Herring (1950) and more recently in several
works by Kleinman and Phillips (1959). Both methods have their advantages
and difficulties. However, our purpose here is to focus, not on the details of
band calculation, but rather on the capabilities of band theory in materials
research. Thus we will play the “materials game,” which consists of the
following four parts.
A. Select a material or family of materials with exciting physical and
electronic properties. 

B. Carry through an APW analysis and determine the Bloch functions and 
energies throughout the Brillouin zone. 

C. Apply the theoretical model found to the entire gamut of physical 
properties. 

D. Try to use the deduced model to predict new physical effects. 

II. “Materials Game” 

A. Selection of a family of materials 

The choice of the lead salts for a detailed energy band analysis is based on 
the remarkable physical properties of these materials. PbTe has for many 
years been among the most efficient of thermoelectric materials and it 
sparked the hope of direct, practical conversion of heat into electric power or 
of electronic cooling. One measures in these lead salts some of the highest 
carrier mobilities of any known materials. The static dielectric constants are 
very large with the value for PbTe being reported from 400 to several thousand. 
The lead salts have long been important as infrared detectors and recently 
they have been operated as lasers both as semiconductor diodes and by 
direct optical pumping. Finally, it is believed that under some circumstances 
PbTe may have a superconducting phase. Clearly this family of materials 
must be regarded as an interesting challenge to the energy band method. 

B. The energy band determination 

Relativistic Effects on the Band Structure. We consider here the question
of what dynamical interactions are to be included in the one-electron Hamiltonian
from which the Bloch functions φ_n(k, r) and band energies ε_n(k)
are determined. This is a straightforward question in so far as we can use
an energy gap ε_g as a yardstick of importance. Any interactions comparable to
ε_g must, of course, be retained, and including those small in relation to ε_g will
very likely be of no physical interest. It is a well-known fact that spin-orbit
effects in solids can be very important if the atomic number is large. In the
case of Ge with atomic number 32, the spin-orbit splitting of the valence band
at k = 0 is 0.30 eV, which is to be compared with the direct gap of 0.89 eV.
Since this interaction is a strongly increasing function of atomic number, it is
immediately obvious that spin-orbit effects will be absolutely essential to the
description of the lead salts, since z_Pb = 82 and the energy gap in this set of
materials is of the order of 0.2 eV.

Since we are dealing with an element of such a high atomic number, let 
us consider the derivation of the spin-orbit interaction as usually encountered 
in atomic or solid-state problems in a two-component scheme from the original
four-component relativistic theory. The purpose is to ask if the derivation 
gives rise to any other interactions that might be discarded in a theory aimed 
at explaining the spin-orbit splitting of atomic levels, but which could play 
an essential role in a band analysis. As will be explained below, it was originally 
found by Johnson et al. (1963) during the investigation of PbTe that the long 
known but universally overlooked mass-velocity and Darwin corrections to 
the energy were of enormous importance. 

The mass-velocity correction can be discussed from the following simple
point of view. Consider an electron associated with an atom in some state
of fixed energy ε which we will write simply as

(1)

In order for the energy ε of the electron to remain constant when it is near
the nucleus, the kinetic energy must become large enough to balance the
singular nuclear attraction ze^2/r, which completely dominates the potential
due to other electrons V_el(r). We can define a “relativistic region” about a
nucleus by a radius R_0 such that

(2)

which for Pb turns out to be approximately R_0 = 10^-12 cm. It is an interesting
fact that an electron spending only a small fraction of its time inside this
relativistic region suffers a substantial change in energy. However, only
those Bloch functions with some s-like character will have a significant charge
density at such small distances from a nuclear site. Furthermore, the s-character
of an energy band depends on the location in k-space, e.g., a band
could have pure s-character at one symmetry point, but lose it completely at
another. Therefore, the relativistic corrections are k-dependent and will
contribute not only to the value of the band separations but to the effective
masses as well. Let ε be written near the nucleus as


\[ \varepsilon = \frac{mc^2}{[1 - (v^2/c^2)]^{1/2}} - \frac{ze^2}{r} \qquad (3) \]

Expanding the square root and using the relativistic relation between momentum
and velocity, one finds
The final form of the one-electron Hamiltonian obtained from the Dirac
equation by two successive applications of the Foldy-Wouthuysen transformation
is, in atomic units,

\[ \mathcal{H} = -\nabla^2 + V - \tfrac{1}{4}\alpha^2 p^4 + \tfrac{1}{8}\alpha^2 \nabla^2 V + \tfrac{1}{4}\alpha^2\, \boldsymbol{\sigma} \cdot [\nabla V \times \mathbf{p}]. \qquad (5) \]

An excellent account of the mass-velocity and Darwin interactions will be
found in Bethe and Salpeter (1957) and a discussion of the derivation of Eq.
(5) for a many-electron system is given by Pratt (1963).
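
The coefficient of the p^4 (mass-velocity) term in Eq. (5) can be checked against the expansion of the relativistic kinetic energy. The following is a sketch, assuming that the atomic units of Eq. (5) are Rydberg units (ħ = 1, m = 1/2, e^2 = 2, so that c = 2/α); this choice is suggested by the form of the kinetic term but is not stated explicitly in the text.

```latex
\begin{align*}
\frac{mc^2}{\sqrt{1 - v^2/c^2}}
  &= \sqrt{m^2c^4 + p^2c^2}
   = mc^2 + \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} + \cdots \\[4pt]
\frac{p^2}{2m} &\;\longrightarrow\; p^2 = -\nabla^2 ,
\qquad
\frac{p^4}{8m^3c^2} \;\longrightarrow\;
\frac{p^4}{8\,(1/8)\,(4/\alpha^2)} = \tfrac{1}{4}\alpha^2 p^4
\qquad (\hbar = 1,\; m = \tfrac{1}{2},\; c = 2/\alpha).
\end{align*}
```

With the same substitutions the standard Darwin coefficient ħ^2/(8m^2c^2) becomes α^2/8 and the spin-orbit coefficient ħ/(4m^2c^2) becomes α^2/4.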

The relativistic corrections were included in the band analysis of the lead
salts by first finding the Bloch functions in terms of APW’s and the corresponding
energies for the nonrelativistic Hamiltonian H_0 = -∇^2 + V. Then
the basis functions for the double group states were constructed from these
states, and the matrix elements of the full relativistic Hamiltonian were calculated
and this matrix diagonalized. A detailed discussion is given by Conklin et al.
(1965). Both the valence band maximum and conduction band minimum
occur at the (π/a)(111) or k = L edge of the Brillouin zone. In Figure 1 the



Fig. 1. Energy band results at the (111) zone edge for PbTe. 

results at k = L are shown for H_0, then H_0 plus the mass-velocity and Darwin
corrections, and finally including spin-orbit coupling. Note how the L_1^+ band
has a major change in energy due to the mass-velocity and Darwin terms
while the L_2^- band is less affected and the L_3 bands still less. This is due to the
fact that only the L_1^+ band has s-like symmetry about the Pb sites, while the
L_2^- is s-like only at the Te sites and the L_3 have no s-character at all.
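
The last step of the procedure described above, in which the relativistic Hamiltonian is set up in the basis of nonrelativistic states and then diagonalized, can be illustrated with the smallest possible case of two states coupled by a single real matrix element. This is a generic textbook sketch, not the double-group calculation of Conklin et al. (1965); the energies and coupling in the demonstration are hypothetical.

```python
import math


def two_level_mixing(e1, e2, coupling):
    """Diagonalize the real symmetric matrix [[e1, V], [V, e2]].

    Returns (lower, upper, weight), where weight is the fraction of basis
    state 1 contained in the upper eigenstate.
    """
    mean = 0.5 * (e1 + e2)
    root = math.hypot(0.5 * (e1 - e2), coupling)
    # Upper eigenvector is (cos t, sin t) with tan(2t) = 2V / (e1 - e2).
    angle = 0.5 * math.atan2(2.0 * coupling, e1 - e2)
    return mean - root, mean + root, math.cos(angle) ** 2


# Hypothetical unperturbed energies and coupling (eV), for illustration only.
lower, upper, weight = two_level_mixing(0.0, -1.0, 0.3)
print(f"levels: {lower:.3f} eV and {upper:.3f} eV")
print(f"weight of state 1 in the upper level: {weight:.3f}")
```

The coupling pushes the two levels apart and mixes their character, which is the same mechanism by which spin-orbit coupling mixes the single-group states at L.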

It is quite clear that agreement of the band structure with experiment
would be hopeless if the relativistic corrections were ignored. Since the discovery
of their marked effect in PbTe they have been included in the band
investigations of several other cases. Herman et al. (1963) have estimated
their importance in several families of materials and Loucks (1965) has set 
up the energy band problem on a four-component basis, essentially solving 
the Dirac equation in a solid. 

C. Comparison with experiment 

The only ambiguous feature of the APW method is the choice of sphere 
radii surrounding the constituent atoms and the value of the constant potential 
in the region between the spheres. In fact by varying the constant potential 
it is possible to make considerable changes in the energy gap, even reducing it 
to zero. Only the lowest conduction band at L of PbTe exhibited a strong 
sensitivity and the constant potential was determined by trial and error in 
order to fit the gap. This represents the only use of experimental information 
in an otherwise ab initio investigation. A detailed presentation of the APW 
results is given by Conklin et al. (1965) and Rabii (1966). Not only does one 
obtain a semiconductor, but the energy gap occurs at the correct place in 
k-space, i.e., k = L, as dictated by experiment. However, band theory can give
a great deal more information than this. Applying k·p perturbation
theory (Kane, 1956; Roth, 1960) yields the effective mass tensors and
g-factor tensors for the various bands. This perturbation scheme was used
on PbTe by Pratt and Ferreira (1964) and by Rabii (1966) for PbSe and PbS,
and the results are shown in Table 1.


TABLE 1

A Comparison Between Theory and Experiment for the Lead Salts (a)

Material  Band          m_t              m_l              g_∥
PbTe      Conduction    0.031 (0.024)    0.238 (0.24)     -29.16 (~ -47.0)
PbTe      Valence       0.034 (0.022)    0.426 (0.31)      31.36 (~ +47.0)
PbSe      Conduction    0.064 (0.040)    0.081 (0.070)    -11.57 (~ +37.0)
PbSe      Valence       0.068 (0.034)    0.095 (0.068)     21.06 (~ +37.0)
PbS       Conduction    0.150 (0.080)    0.183 (0.105)     -5.20  -11.5
PbS       Valence       0.203 (0.075)    0.288 (0.105)      6.82   7.0

(a) In each entry the first number is the theoretical value; the experimental
values are given in parentheses beside the corresponding results.


In addition, since the k·p results are based on interband momentum matrix
elements, one can predict the oscillator strength and relative polarization of 
the interband optical transitions. This will be applied below to the properties 
of Pb salt lasers. 
One must view these results with some satisfaction. Although the effective
masses tend to be too heavy and the g-factors too small, these numerical
results come from a very sensitive calculation and there is certainly good 
overall agreement with experiment. There is another side to this coin. Let us 
imagine that theoretical considerations indicate that some material should have 
interesting electronic or optical properties. In addition let this be a material 
for which there are no suitable single crystals available. Then instead of being 
forced to develop the preparation techniques and embark on an experimental 
program, a band analysis can be completed in a matter of weeks and at very 
small expense that will be able to predict with a very reasonable degree of 
accuracy the results of such measurements. 
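
The remark that the calculated masses tend to be too heavy and the g-factors too small can be checked directly against Table 1. The sketch below uses the (theory, experiment) pairs as read from the table; only entries whose experimental value appears in parentheses are included, and the dictionary layout is an illustrative choice.

```python
# (theory, experiment) pairs read from Table 1.
MASSES = {
    ("PbTe", "conduction", "m_t"): (0.031, 0.024),
    ("PbTe", "conduction", "m_l"): (0.238, 0.24),
    ("PbTe", "valence", "m_t"): (0.034, 0.022),
    ("PbTe", "valence", "m_l"): (0.426, 0.31),
    ("PbSe", "conduction", "m_t"): (0.064, 0.040),
    ("PbSe", "conduction", "m_l"): (0.081, 0.070),
    ("PbSe", "valence", "m_t"): (0.068, 0.034),
    ("PbSe", "valence", "m_l"): (0.095, 0.068),
    ("PbS", "conduction", "m_t"): (0.150, 0.080),
    ("PbS", "conduction", "m_l"): (0.183, 0.105),
    ("PbS", "valence", "m_t"): (0.203, 0.075),
    ("PbS", "valence", "m_l"): (0.288, 0.105),
}
G_PARALLEL = {
    ("PbTe", "conduction"): (-29.16, -47.0),
    ("PbTe", "valence"): (31.36, 47.0),
    ("PbSe", "conduction"): (-11.57, 37.0),
    ("PbSe", "valence"): (21.06, 37.0),
}

# Count how often theory overestimates a mass or underestimates |g|.
heavier = sum(1 for theory, expt in MASSES.values() if theory > expt)
smaller_g = sum(1 for theory, expt in G_PARALLEL.values()
                if abs(theory) < abs(expt))

for key, (theory, expt) in MASSES.items():
    print(key, f"theory/experiment = {theory / expt:.2f}")
print(f"{heavier} of {len(MASSES)} calculated masses exceed experiment")
print(f"{smaller_g} of {len(G_PARALLEL)} calculated |g| fall below experiment")
```

With these entries, eleven of the twelve calculated masses exceed the measured ones and all four calculated g-factors are smaller in magnitude than the measured ones.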

These numerical values for the masses and g-factors obtained from the APW 
work on the Pb salts represent the present day “state of the art.” Lin and 
Kleinman (1966) have obtained very similar results using the OPW scheme. 
It is very likely that the major source of disagreement with experiment lies 
in the non-self-consistent crystal potential and probably not in more esoteric 
areas such as correlation corrections. 

One can go still further in relating the band results to experiment. A determination
of the deformation potential tensors for the bands at various points
in k-space will suffice to describe the effect of isotropic or uniaxial applied
stress on the band energies. Furthermore, one can calculate all those transport
properties which are dominated by acoustic rather than optical phonon
processes. Thus the piezoresistance, low-temperature mobility, etc., can be
evaluated theoretically. The deformation potentials for PbTe were found by 
Ferreira (1965) along the lines of the theory of Picus and Bir (1959). Rabii 
(1966) applied the same computer programs to PbSe and PbS. Once again 
agreement with experiment was excellent, usually lying within the experimental 
uncertainty. 

D. Speculations on new physical effects 

This final phase of the “Materials Game” is perhaps the most interesting.
Our speculations will be confined to two areas: stress effects on Pb salt
lasers and a Jahn-Teller effect for many-valley semiconductors.

Shortly after Ferreira had completed his deformation potential work, 
PbTe and PbSe diodes were operated as lasers. This suggested the possibility 
of applying uniaxial or hydrostatic stress and studying the change of frequency 
and polarization. Hydrostatic pressure had already been
used on GaAs diode lasers by Feinleib et al. (1963) and uniaxial stress by
Meyerhofer and Braunstein (1963). It was known, furthermore, that the energy
gap decreased in the Pb salts under hydrostatic pressure (Paul, 1961). Besson
et al. (1965) have continuously tuned a PbSe diode from 8.5 microns to over
20 microns, which is certainly an important technical achievement. Pratt and
Ripper (1965) studied the effect of hydrostatic and uniaxial stress on the laser
radiation. Since the output follows changes in the energy gap, one can use
the laser to make a very precise check on the deformation potentials. Consider 
the conduction and valence bands at L. They are found to be 

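\[
\begin{aligned}
|v\rangle &= 0.943\, L_6^{+}(L_1^{+}) + \frac{0.228}{\sqrt{2}} \{ -i L_{31}^{+} + L_{32}^{+} \} \\
|c\rangle &= 0.807\, L_6^{-}(L_2^{-}) - \frac{0.568}{\sqrt{2}} \{ L_{32}^{-} + i L_{31}^{-} \}.
\end{aligned}
\qquad (6)
\]

Thus the valence band wave function at L has L_6^+ symmetry and, due to spin-orbit
mixing, it derives partially from the L_1^+ level and partially from the L_3^+
level. The change of the valence band per unit strain ε in the (100) direction
is given by

\[
\langle v | H_{\mathrm{strain}}(100) | v \rangle
 = (0.943)^2 \langle L_1^{+} |100| L_1^{+} \rangle
 + \frac{(0.228)^2}{2} \left[ \langle L_{31}^{+} |100| L_{31}^{+} \rangle
 + \langle L_{32}^{+} |100| L_{32}^{+} \rangle \right]
 + \sqrt{2}\,(0.943)(0.228) \langle L_1^{+} |100| L_{32}^{+} \rangle. \qquad (7)
\]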

The matrix elements of the (100) strain Hamiltonian are given by Ferreira
(1965) as -5.444, -4.449, +1.131, and -7.191 in the order they occur on
the right-hand side. The matrix element <L_1^+|100|L_31^+> is identically zero since
these states belong to different representations of the group of the strained
crystal. Carrying through the evaluation of this expression one finds the
valence band changes by -7.12 eV per unit strain. A similar calculation for
the conduction band shows that it changes by +1.28 eV per unit strain and,
consequently, the gap changes by 8.40 eV per unit (100) strain.* Obviously, a
great deal can be learned concerning the deformation potential tensors for the
various bands and about the effect of spin-orbit mixing of the bands by studying
uniaxial pressure tuning, and the first work of this type has been done by
Calawa et al. (1965).
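
The evaluation of Eq. (7) can be reproduced numerically. The sketch below assumes a reading of Eq. (7) in which the second term carries the factor (0.228)^2/2 and the cross term the factor √2(0.943)(0.228); the variable names are illustrative, and the matrix elements are the four values quoted from Ferreira (1965).

```python
import math

C_L1 = 0.943   # coefficient of the L_1^+ component of the valence state, Eq. (6)
C_L3 = 0.228   # coefficient of the L_3^+ admixture, Eq. (6)

# (100) strain matrix elements in eV per unit strain, in the order quoted.
M_L1_L1 = -5.444
M_L31_L31 = -4.449
M_L32_L32 = 1.131
M_L1_L32 = -7.191


def valence_shift_per_unit_strain():
    """Right-hand side of Eq. (7), term by term."""
    diagonal_l1 = C_L1 ** 2 * M_L1_L1
    diagonal_l3 = 0.5 * C_L3 ** 2 * (M_L31_L31 + M_L32_L32)
    cross_term = math.sqrt(2.0) * C_L1 * C_L3 * M_L1_L32
    return diagonal_l1 + diagonal_l3 + cross_term


valence = valence_shift_per_unit_strain()
conduction = 1.28                  # quoted conduction band shift, eV per unit strain
gap_change = conduction - valence
print(f"valence band shift: {valence:.2f} eV per unit strain")
print(f"gap change:         {gap_change:.2f} eV per unit strain")
```

With the rounded coefficients of Eq. (6) this gives about -7.11 eV per unit strain for the valence band, which agrees with the quoted -7.12 eV to within rounding, and a gap change close to the quoted 8.40 eV.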

The above remarks assume only that the laser output follows changes in the 
energy gap regardless of the exact nature of the optical transitions involved. 
If they are band-to-band transitions, it is possible to use the momentum matrix
elements required for the k·p study to predict the effect of strain on the polarization
of the laser output. This is described by Pratt and Ripper (1965) and
may prove to be an important technique for investigating the basic laser 
mechanism. 

In addition to the effects of static stress, very interesting possibilities exist 
in the application of time-dependent stress via sound waves. A compressional 

* This result differs from that given in Pratt and Ripper (1965) where a numerical error 
was discovered. A corrected version of those results and an extension to PbSe, PbS, and 
SnTe will be published shortly. 
wave will modulate the index of refraction at the sound frequency ω_s. Since
the frequency of the cavity modes of a small chip of semiconductor with two
cleaved and parallel faces depends on the refractive index, the mode frequency
will be modulated as well. Therefore, we get direct frequency modulation of
the laser. Solving Maxwell’s equations for this case shows that the modulation
index is proportional to (ω_0/ω_s)(δε/ε), where ω_0 is the frequency of the unmodulated
laser and δε/ε is the fractional change in dielectric constant due to
the sound waves. Since ω_0/ω_s is of the order of 10^8, only a very small change
in ε is required to achieve a modulation index in excess of unity. As a result,
it should be possible to divert a considerable amount of energy into the 
frequency modulation side bands. Work is now underway in this laboratory on 
this project. 
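
The orders of magnitude in this argument can be sketched numerically. The text gives only a proportionality, so the prefactor below is a hypothetical parameter set to unity; the sideband estimate uses the standard result of frequency-modulation theory that the carrier retains a fraction J_0(β)^2 of the power at modulation index β, which is brought in here as an outside assumption rather than taken from the text.

```python
def modulation_index(freq_ratio, frac_dielectric_change, prefactor=1.0):
    """Modulation index, taken as prefactor * (w0 / ws) * (delta_eps / eps).

    The prefactor is hypothetical; the text states only the proportionality.
    """
    return prefactor * freq_ratio * frac_dielectric_change


def bessel_j0(x, terms=40):
    """J_0(x) from its power series, adequate for moderate x."""
    total = 0.0
    term = 1.0                       # k = 0 term of the series
    for k in range(terms):
        total += term
        term *= -((0.5 * x) ** 2) / ((k + 1) ** 2)
    return total


def sideband_power_fraction(beta):
    """Fraction of the power moved out of the carrier into all sidebands."""
    return 1.0 - bessel_j0(beta) ** 2


for frac in (1e-9, 1e-8, 2.4e-8):
    beta = modulation_index(1e8, frac)
    print(f"delta_eps/eps = {frac:.1e}: index = {beta:.2f}, "
          f"sideband fraction = {sideband_power_fraction(beta):.2f}")
```

On these assumptions a fractional change in ε of order 10^-8 already gives an index of order unity, with a substantial part of the power in the sidebands.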

As a final topic let us consider the possibility of a spontaneous Jahn-Teller
distortion in a multivalley semiconductor. The usual application of the
Jahn-Teller effect is to a paramagnetic salt in which the paramagnetic ion has an
orbitally degenerate ground state in the undistorted crystal. For example, the
Cu$^{2+}$ ion in CuSO$_4\cdot$5H$_2$O has a doubly degenerate ground state of $\Gamma_{12}$
symmetry in the cubic configuration. This crystal suffers a spontaneous
tetragonal distortion which splits the ground state. This splitting is linearly
dependent on the strain $\varepsilon$ while the elastic energy is a quadratic function of the
strain. Writing the total energy as

$$E = \tfrac{1}{2}K\varepsilon^2 - A\varepsilon \tag{8}$$

leads to a value of $\varepsilon = A/K$ for the minimum energy. According to standard
first-order perturbation theory the electronic energy in a solid will vary
linearly with strain for small $\varepsilon$ and again the elastic energy will go as $\varepsilon^2$.
Therefore, we suggest that exactly the same mechanism for spontaneous
distortion should prevail in a multivalley semiconductor.
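
The minimization behind the quoted equilibrium strain can be checked with a few lines. This is only a sketch of the quadratic-plus-linear energy of Eq. (8); the numerical values of the stiffness K and coupling A are arbitrary illustrations, not data from the text.

```python
def total_energy(strain: float, stiffness: float, coupling: float) -> float:
    """E = (1/2) K eps^2 - A eps, the form of Eq. (8)."""
    return 0.5 * stiffness * strain ** 2 - coupling * strain


def equilibrium_strain(stiffness: float, coupling: float) -> float:
    """Strain at which dE/d(eps) = K eps - A vanishes."""
    return coupling / stiffness


def energy_gain(stiffness: float, coupling: float) -> float:
    """Energy at the minimum, -A^2 / (2K): always negative, so the
    distorted state lies below the undistorted one in this model."""
    return -coupling ** 2 / (2.0 * stiffness)


# Arbitrary illustrative numbers (assumptions).
K, A = 2.0, 0.5
eps0 = equilibrium_strain(K, A)
print(eps0, total_energy(eps0, K, A), energy_gain(K, A))
```

Because the linear term always wins at small strain, any nonzero coupling A lowers the energy, which is why the distortion is spontaneous in this picture.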

Keyes (1961) has already pointed out that the lattice constant $c$ should be a
function of carrier concentration $N$. Thus he writes the total energy $E$ as

$$E = \tfrac{9}{2}B(\delta c/c)^2 + N D_{\mathrm{iso}}(\delta c/c). \tag{9}$$

Here $B$ is the bulk modulus and the first term is the elastic energy. The second
term is the electronic energy. $D_{\mathrm{iso}}$ is the isotropic deformation potential for
the band occupied by the $N$ carriers. Keyes predicts the lattice constant
behavior for n-type Ge to be

$$\delta c/c = 1.4 \times 10^{-24} N. \tag{10}$$

Therefore, one would require $N$ to be of the order of $10^{20}$ to get a measurable
effect. For p-type PbTe the same calculation gives

$$\delta c/c = 1.42 \times 10^{-23} N,$$

an order of magnitude larger. The effect has been observed in n-type Ge by
Bruner and Keyes (1961), and Hall (1965) has measured the change with doping
of the third-order elastic constants.

The proposed Jahn-Teller effect thus goes one step beyond the Keyes 
proposal and suggests that the lattice constant will not vary isotropically 
with carrier concentration. It will be recalled that the mechanism responsible 
for piezoresistance in a multivalley semiconductor involves a splitting in 
energy of these valleys by an applied strain followed by a redistribution of 
carriers among the valleys. This Jahn-Teller effect suggests that this splitting
and redistribution occur spontaneously. The Keyes effect mentioned above is
clearly only observable in highly doped materials and the same would be 
true of this Jahn-Teller distortion. 

III. Summary 

We have attempted to show in this paper how energy band theory has become
an instrument in materials research. In the first place, it provides a
theoretical model which can be used to interpret and unify a wide variety of 
experimental results. Secondly, the calculated effective masses, g-factors, and 
deformation potentials can be expected to be in reasonable qualitative 
agreement with experiment. Finally, one can make at least educated guesses 
as to how a material might behave under new circumstances. In this vein we 
have speculated about a possible frequency modulation of a semiconductor 
laser by sound waves and a Jahn-Teller effect in a multivalley material. 

There certainly are theoretical frontiers remaining in band theory such as 
self-consistency, correlation, and electron-phonon effects. However, the field 
has matured and is now bearing fruit in many areas. This harvest would 
never have been realized at this time had it not been for the enormous 
contribution of Professor John C. Slater. Since roughly 1930, he has mounted 
a sustained effort that has almost singlehandedly carried energy band theory 
to its present level of success. 

Exploiting a suggestion by Aigrain (7), Bowers performed the following
experiment: a very pure sample of sodium at liquid helium temperatures
formed the core of a transformer, which was placed in a dc magnetic field.
When a current in the primary was interrupted, eddy current decay in the
sample induced a voltage in the secondary, which was displayed on an
oscilloscope. Superimposed on the exponential was the oscillation predicted
by Aigrain (7) and baptized by him the “helicon.” Bowers et al. (2) and Cotti
et al. (2) pursued the oscillation. Helicons (“magnetoplasma oscillations” of
plasma physics, “whistlers” of upper atmosphere physics) became a fashionable
source of Physical Review Letters (2-10) and even a useful tool for the
study of solids (11-14).

At first their physics was simple: From Maxwell’s equations,

$$\mu(\omega, q)\,\varepsilon(\omega, q) = \frac{c^2 q^2}{\omega^2}. \tag{1}$$


Consider first a nonmagnetic metal and ignore the small electronic susceptibility
in the dc magnetic field, $H_0\hat{z}$. Then $\mu(\omega, q) = 1$ and we desire the transverse
components $\varepsilon_\pm(\omega, q)$ of the dielectric tensor. A transverse perturbation
traveling in the z direction,

$$E_\pm = E_0(\hat{x} \pm i\hat{y})\,e^{i(qz - \omega t)}, \tag{2}$$

induces a response of the same form:

$$P_\pm = \frac{\varepsilon_\pm - 1}{4\pi}\,E_\pm = -neR_\pm, \tag{3}$$

where $R_\pm$ is a displacement wave. From Newton’s laws, including the Lorentz
force, one finds

$$\varepsilon_\pm = 1 - \frac{\omega_p^2}{\omega(\omega \pm \omega_c)}. \tag{4}$$


* Present address: Department of Physics, Faculty of Engineering Sciences, Osaka
University, Toyonaka, Japan.
Right-hand circularly polarized light is reflected ($\varepsilon_+ < 0$) but for left-hand
polarization there are two transmission bands ($\varepsilon_- > 0$ for $\omega < \omega_c$; $\omega > \omega_p$).
Ignoring unity in Eq. (4), and for frequencies well below $\omega_c$, substitution
of $\varepsilon_-(\omega, q)$ into Eq. (1) gives the quadratic helicon dispersion relation
(2,11,12,15,16)

$$\omega = \frac{\omega_c c^2 q^2}{\omega_p^2}. \tag{5}$$

At metallic densities the plasma frequency $\omega_p = 10^{16}$ sec$^{-1}$ and in a field of
$2 \times 10^4$ gauss the cyclotron frequency $\omega_c = 3.5 \times 10^{11}$ sec$^{-1}$. Sixty-cycle
light then has a wavelength of approximately ½ cm and a velocity of approximately
35 cm/sec in the metal.
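
The quoted wavelength and velocity follow directly from the quadratic dispersion relation $\omega = \omega_c c^2 q^2/\omega_p^2$. The sketch below redoes the arithmetic with the values of $\omega_p$ and $\omega_c$ given in the text; the speed of light in CGS units and the function name are supplied here, and "velocity" is interpreted as the phase velocity $\omega/q$ (an assumption).

```python
import math

C_LIGHT = 2.998e10  # speed of light, cm/sec (CGS)


def helicon_wavenumber(omega: float, omega_p: float, omega_c: float,
                       c: float = C_LIGHT) -> float:
    """Solve omega = omega_c * c^2 * q^2 / omega_p^2 for q (in cm^-1)."""
    return math.sqrt(omega * omega_p ** 2 / (omega_c * c ** 2))


omega = 2.0 * math.pi * 60.0  # sixty-cycle excitation, rad/sec
q = helicon_wavenumber(omega, omega_p=1e16, omega_c=3.5e11)
wavelength_cm = 2.0 * math.pi / q
phase_velocity_cm_s = omega / q  # assumption: "velocity" means phase velocity

print(wavelength_cm, phase_velocity_cm_s)
```

With these inputs the wavelength comes out a little over half a centimetre and the phase velocity in the mid-thirties of cm/sec, in line with the figures quoted above.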

Coupling of helicons and phonons is large only where their spectra cross
(7,5):

$$\frac{\omega_c c^2 q^2}{\omega_p^2} = v_s q. \tag{6}$$

By knowledgeable selection of a particular sound mode propagating in a
special crystallographic direction in potassium with unusually low velocity
$v_s$, Grimes and Buchsbaum (9) were able to observe this crossing.

Equation (4) gives only the real part of the complex dielectric function

$$\varepsilon_\pm = \varepsilon_1^\pm + i\varepsilon_2^\pm. \tag{7}$$

There is also an imaginary term, known in the context of magnetoacoustic
absorption as the “Kjeldaas edge” (17,18), and in helicon lore as
“doppler-shifted cyclotron resonance absorption” (4,15,19,20). An electron on the
Fermi surface with velocity component $\pm v$ along the field direction experiences
a doppler-shifted frequency. When this frequency coincides with $\omega_c$
there is cyclotron excitation of the electron, and hence absorption of helicons
within the cone

$$\omega = \omega_c \pm v_F q. \tag{8}$$

Stern (4) proposed that onset of this absorption be used to study Fermi
surface topology.

The quantum theory of the dielectric tensor (21,22) differs importantly
from the above. Helicon absorption occurs principally through electronic
excitation between adjacent Landau levels, with the absorbing electron
emerging from the Fermi sea. Thus $\varepsilon_2^-(\omega, q)$ is highly structured, with many
shoulders and windows. The Kramers-Kronig relation shows that at each
discontinuity in $\varepsilon_2^-(\omega, q)$ there is a logarithmic singularity in $\varepsilon_1^-(\omega, q)$ which
greatly enriches the dispersion relation (10). Without displaying the complicated
dielectric function we can nonetheless understand some of its important
features. An electron on the nth Landau level, with initial z component of
momentum $\hbar k_i$, has energy

$$E_n = \frac{\hbar^2 k_i^2}{2m} + \left(n + \tfrac{1}{2}\right)\hbar\omega_c. \tag{9}$$

In absorption of helicon energy $\hbar\omega$ and momentum $\hbar q$ the electron is excited
to the (n + 1)st level. Conserving energy and momentum,

$$\hbar\omega = \hbar\omega_c + \frac{\hbar^2}{2m}\left(2k_i q + q^2\right), \tag{10}$$

but $k_i$ and $q$ are restricted by Pauli exclusion. To lie within the Fermi sphere
on the nth Landau cylinder,

$$|k_i(n)| < \left[k_F^2 - \left(n + \tfrac{1}{2}\right)k_c^2\right]^{1/2}, \tag{11}$$

$$k_c = (2m\omega_c/\hbar)^{1/2}. \tag{12}$$

Transitions among low-lying levels [those in which $(n + \tfrac{1}{2})k_c^2 < k_F^2$] with
$k_i \approx -k_F$ and $q \ll k_F$ produce the lower Kjeldaas edge of Eq. (8) and
numerous fine absorption filaments and slits adjacent to it. This can be seen
by expansion of Eq. (11) and substitution into Eq. (10). These slits should
produce sharp oscillations in helicon attenuation (5,6), and absorption of
transverse sound (23,24). They are quite similar in structure to the filaments
along the Kjeldaas edge in $\varepsilon_2(\omega, q)$ due to $\Delta n = 0$ transitions, which are
expected to cause “giant quantum oscillations” in longitudinal sound absorption
(25). Small q transitions among inner levels, in which the electron has
initial wave number $k_i = +k_F$, produce the upper Kjeldaas edge. There are
also large momentum-transfer transitions among low-lying levels, those in
which $k_i = -k_F$ and the electron traverses almost the entire Fermi diameter.
These transitions produce absorption bands at $q = 2k_F$ with windows between
them of width $2\omega_c/v_F$.

Absorption by electrons on levels near the highest level beneath the Fermi
surface displays the usual $1/H$ oscillations. In Eq. (11) the highest level is

(«max + — k 2 p. (13) 

Depending upon the field strength the range of momentum on this highest 
inscribed cylinder is between 

\k(n maK )\ < kjji and \k(n max )\ < k c . (14) 

Transitions among high-lying levels produce absorption bands near q = k c 
separated by windows of the same width (10). 
In summary, $\varepsilon_2^-(\omega, q)$ contains multitudes of fine streaks near $q = \omega_c/v_F =
10^4$ cm$^{-1}$, broad bands and windows near $q = k_c = 10^6$ cm$^{-1}$, and absorption
bands near $q = 2k_F = 10^8$ cm$^{-1}$ separated by windows of width $2\omega_c/v_F = 10^4$
cm$^{-1}$.

The quantum mechanical $\varepsilon_2^-(\omega, q)$ is computed by equating the power
absorbed from the wave to the rate of energy absorption by electrons in
Landau states, calculated by the golden rule. From $\varepsilon_2^-(\omega, q)$ the real part of
the dielectric function is found by the Kramers-Kronig relation. In contrast
to the semiclassical picture of Eq. (8) there are now many windows within the
doppler-shifted absorption cone. In contrast to Eq. (4) and its nonlocal
extension, $\varepsilon_1^-(\omega, q)$ now displays wild oscillations near discontinuities in
$\varepsilon_2^-(\omega, q)$. These oscillations grossly distort the dispersion relation within
the absorption cone, decorating absorption filaments and bands with epiphora
of semistable modes (10). For, in Eq. (1), suppose $\varepsilon_1^-(\omega, q)$ is less than $c^2q^2/\omega^2$
at a frequency below an absorption band edge. The positive logarithmic
singularity near the edge then enforces a solution of Eq. (1).

Now consider the situation in a ferromagnetic metal. To a good approximation
the permeability, from the Landau-Lifshitz equation, is

$$\mu^-(\omega, q) = 1 + \frac{\gamma 4\pi M}{\omega_m - \omega} \tag{15}$$

with $M$ the saturation magnetization and $\omega_m$ the uncoupled magnon frequency:

$$\omega_m = \gamma H_m + x q^2. \tag{16}$$

Stern and Callen (20) show that the solution of Eq. (1), with $\varepsilon_1^-(\omega, q)$ from
Eq. (4) and $\mu^-(\omega, q)$ from Eq. (15), is greatly different from the nonmagnetic
case; there is strong magnon-helicon coupling at all q because of the high
conductivity. Recall that in the semiclassical treatment $\varepsilon_1^-(\omega, q) < 0$ for
$\omega_c < \omega < \omega_p$. The permeability of Eq. (15) also has a negative region:
$\mu^-(\omega, q) < 0$ for $\omega_m < \omega < \omega_m + \gamma 4\pi M$. Suppose $\omega_m < \omega_c < \omega_m + \gamma 4\pi M <
\omega_p$. Then the product $\mu\varepsilon > 0$ in three passbands:

$$0 < \omega < \omega_m; \qquad \omega_c < \omega < \omega_m + \gamma 4\pi M; \qquad \omega_p < \omega.$$
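
The three passbands can be checked by evaluating the sign of the product $\mu\varepsilon$ directly. The sketch below uses the semiclassical dielectric function for the left-hand polarization, $\varepsilon = 1 - \omega_p^2/[\omega(\omega - \omega_c)]$, and the permeability $\mu = 1 + \gamma 4\pi M/(\omega_m - \omega)$ at fixed q. The frequencies are arbitrary dimensionless values chosen only to satisfy the assumed ordering $\omega_m < \omega_c < \omega_m + \gamma 4\pi M < \omega_p$; they are not data from the text.

```python
def eps_minus(w: float, w_p: float, w_c: float) -> float:
    """Semiclassical dielectric function for left-hand polarization."""
    return 1.0 - w_p ** 2 / (w * (w - w_c))


def mu_minus(w: float, w_m: float, gamma_4pi_m: float) -> float:
    """Landau-Lifshitz permeability at fixed magnon frequency w_m."""
    return 1.0 + gamma_4pi_m / (w_m - w)


def propagates(w: float, w_p: float, w_c: float,
               w_m: float, gamma_4pi_m: float) -> bool:
    """A real wave number requires mu * eps > 0."""
    return eps_minus(w, w_p, w_c) * mu_minus(w, w_m, gamma_4pi_m) > 0.0


# Illustrative values (assumptions): w_m=1 < w_c=2 < w_m+gamma4piM=4 < w_p=100.
params = dict(w_p=100.0, w_c=2.0, w_m=1.0, gamma_4pi_m=3.0)
for w in (0.5, 1.5, 3.0, 10.0, 200.0):
    print(w, propagates(w, **params))
```

Because unity is retained in the dielectric function here, the upper band edge sits slightly above $\omega_p$ rather than exactly at it; the sample frequency 200 is chosen well inside that band.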

We have now re-examined this problem (10), employing the quantum-mechanical
dielectric constant. We find coupled modes within the large
windows in $\varepsilon_2^-(\omega, q)$. Their frequencies extend up to microwave and their
wave numbers to $q = k_c$, far beyond the limit where any propagation could
occur classically.

We have ignored line width and losses. Line width will damp out modes
close to the edges of absorption bands, and appreciable losses, of either
nonmagnetic or magnetic origin, will cause even “giant quantum oscillations”
to be barely perceptible. For this reason the new modes are known everywhere
as “Quantum Anomalous Greatly Obliterated Giant Oscillations”

I. Introduction

The purpose of this paper is to review recent advances made in the study 
of impurity effects in narrow bands. These are assumed to be described in 
a tight binding (LCAO) approximation, and the use of phase shifts for 
describing the perturbation is systematically developed. 

The bulk of the paper is specifically centered on the electronic structure of
substitutional impurities in the d-bands of transitional metals, in structures
with one atom per cell (i.e., practically fcc and bcc structures); it assumes a
perturbation localized on the impurity atom and concentrates on effects
related to the density of states. General conditions for the validity of the
perturbation or rigid band approximation can then be set up, and the exact
nature of the strong deviations from it due to various resonance effects can
be defined. We shall show in particular that, strictly speaking, virtual bound
d levels can only occur near a band edge, and under very restricted conditions.

As stressed at the end of this paper, the phase shift method can be applied 
in other cases. Extensions to less-localized perturbations and to wide bands 
are developed. The possible use in the study of interstitial impurities, of 
matrices with more than one atom per cell, phonons, or spin wave problems 
must be mentioned. 

The model used for transitional metals is that originally used by Slater
(1936) to discuss their magnetism; the work described here is a direct
consequence of the type of analysis of perturbation in narrow bands first made by
Koster and Slater (1954). However, in complex bands deduced from degenerate
atomic states such as the d-bands of transitional metals, the treatment
given here allows a simpler analysis, because it brings out more clearly the
symmetry of the problem. The explicit use of phase shifts is of especial
interest for a general discussion on the densities of states.

Section II of this paper describes the model used, introduces the general
concepts, and studies the conditions of validity of the rigid band approximation.
Section III is devoted to the strong deviations from this approximation
due to resonance effects. Section IV reviews some possible extensions of this
type of treatment.


II. The Impurity Problem for Localized Perturbations 
in Narrow Bands 


A. The model 

Having in mind a specific problem of transitional alloys, we shall make, in 
the main part of this paper, a number of simplifying assumptions. These will 
be analyzed and extended in the last part. 

1. Band Structure. The band structure for the valence electrons of a
transitional metal is made of overlapping d and s bands, for which the
following is assumed: (a) The s-d mixing effects are neglected, in pure
metals and in alloys. (b) The d-bands are described by a tight binding
method which neglects all overlap integrals except those involving the lattice
potential and atomic functions centered on neighboring sites. (c) Spin orbit
coupling effects are neglected. (d) The lattice structure is assumed to have
one atom per cell, and the d wave functions $|i\rangle$ (i = 1, ..., 5) are chosen so
that they are basis functions of the irreducible representations $\Gamma$ of the point
group of the lattice. In practice, applications will be made to cubic (fcc or bcc)
crystals. The $\Gamma$ representations are then $\Gamma_{12}$ or $\Gamma'_{25}$.

2. The Perturbing Potential. We study highly diluted alloys, in which each
impurity atom can be considered as acting independently. These impurity
atoms are assumed to be substitutional, so that the symmetry of the perturbed
lattice is that of the (pure) lattice point group.

In the Hartree approximation, the perturbing potential $V_p(\mathbf{r})$ is due to
(a) the impurity excess charge and (b) its screening charge, mostly by
d-electrons. Owing to the screening, $V_p$ has a finite range and, owing to the
high density of d states, it is often well localized on the impurity cell. We
shall assume for the moment this localization to be complete. Exchange
effects, such as the possible appearance of localized magnetic moments, are
not considered in this paper.


B. The T-matrix and the phase shift operator 

From each wave function $|\varphi^{(n)}\rangle$ of the pure metal describing a one-electron
state of energy E, one can define an eigenfunction of the perturbed crystal by

$$|\psi^{(n)(+)}\rangle = T^{(+)}(E)\,|\varphi^{(n)}\rangle, \tag{1}$$

where $T^{(+)}(E)$ is the scattering operator defined by

$$T^{(+)}(E) = \left[1 - \frac{1}{E - \mathcal{H}_0 + i\eta}\,V_p\right]^{-1}, \qquad \eta \to +0, \tag{2}$$

and $\mathcal{H}_0$ is the one-electron Hamiltonian of the pure metal.

All the properties of the perturbed crystal can be deduced from a knowledge
of $T^{(+)}(E)$ and especially of the following:

(a) The perturbed wave functions [see Eq. (1)].

(b) The variation of the number Z(E) of states of energy E' < E. It is easily
shown (Blandin, 1961) that it is related to the phase shift operator $\delta(E)$ by

$$Z(E) = \frac{1}{\pi}\,\mathrm{Tr}_E\,\delta(E), \tag{3}$$

where

$$\tan\delta(E) = -\pi\,\delta(E - \mathcal{H}_0)\,V_p\,K(E). \tag{4}$$

Here $\delta(E - \mathcal{H}_0)$ is the Dirac delta function, not the phase shift. The trace in
(3) is taken over all the eigenfunctions of energy E and

$$K(E) = T(E)\left[1 - \pi i\,\delta(E - \mathcal{H}_0)\,V_p\,T(E)\right]^{-1}.$$

(c) The variation of the density of states: in the low concentration limit,
the change in density of states per unit energy, volume, and spin for a
concentration c is given by

$$\delta n(E) = c\,\frac{dZ(E)}{dE}. \tag{5}$$

All these equations are quite general. The perfect screening of the impurity
thus requires exactly

$$Z(E_F) = Z, \tag{6}$$

where Z is the excess charge of the impurity and $E_F$ is the Fermi level.


C. The choice of a representation 

The solution of Eq. (2) requires the choice of a representation. In the tight
binding approximation, it is natural to choose a representation based on the
complete set $|i\lambda\rangle$ of atomic d functions $|i\rangle$ centered on the crystalline sites $\mathbf{R}_\lambda$.
This will be called the $\{i\lambda\}$ representation. It neglects the overlap integrals of
atomic functions on different sites, and is especially useful if the perturbation
potential $V_p$ is well localized on the impurity cell. In the spirit of the tight
binding approximation, one can then neglect all the matrix elements $\langle i\lambda|V_p|j\mu\rangle$
except those $\langle i0|V_p|j0\rangle$ involving the atomic functions on the impurity site.
For a substitutional impurity in a lattice with one atom per cell, $V_p$ has the
symmetry properties of the point group of the lattice. The matrix elements
$\langle i0|V_p|j0\rangle$ then vanish unless the i and j atomic states are identical. The
corresponding integral $\langle i0|V_p|i0\rangle = V(\Gamma)$ only depends on the nature of the
corresponding class $\Gamma$ in the irreducible representation of the point group of
the lattice. Finally,

$$\langle i\lambda|V_p|j\mu\rangle = \delta_{ij}\,\delta_{\lambda\mu}\,\delta_{\lambda 0}\,V(\Gamma_i). \tag{7}$$

In the cubic (fcc or bcc) lattices, there are only two values of $V(\Gamma)$,
corresponding to $\Gamma = \Gamma_{12}$ and $\Gamma'_{25}$.

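Introducing the Green functions

$$G^{(+)}_{ij}(E, \lambda, \mu) = \langle i\lambda|\,\frac{1}{E - \mathcal{H}_0 + i\eta}\,|j\mu\rangle, \tag{8}$$

we have then from Eq. (2)

$$\langle i0|T^{(+)}(E)|i0\rangle = \left[1 - V(\Gamma_i)\,G^{(+)}_{ii}(E, 0, 0)\right]^{-1} \tag{9}$$

and the scattering operator is defined in the $\{i\lambda\}$ representation by

$$\langle i\lambda|T^{(+)}(E)|j\mu\rangle = \delta_{\lambda 0}\,\delta_{\mu 0}\,\delta_{ij}\,\frac{1}{1 - V(\Gamma_i)\,G^{(+)}_{ii}(E, 0, 0)} + (1 - \delta_{\lambda 0})\left\{(1 - \delta_{\mu 0})\,\delta_{\lambda\mu}\,\delta_{ij} + \frac{\delta_{\mu 0}\,V(\Gamma_j)\,G^{(+)}_{ij}(E, \lambda, 0)}{1 - V(\Gamma_j)\,G^{(+)}_{jj}(E, 0, 0)}\right\}. \tag{10}$$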


It should be stressed that the use of the $\{i\lambda\}$ representation would be
strictly equivalent to that of the Wannier representation first developed by
Koster and Slater (1954), if the narrow band considered were derived from
nondegenerate atomic orbitals (Wolff, 1961; Clogston, 1962; Seeger and
Staatzs 1962; Goodings and Moser, 1964). But, for narrow bands derived
from degenerate atomic orbitals such as the d band of transitional metals,
the use of the Wannier representation leads to much more complex equations
than the $\{i\lambda\}$ representation. The physical reason is that each d subband does
not, in general, possess, as a whole, the symmetry properties of the lattice,
which therefore do not come out clearly from the equations. As a result, no
simplification comparable to Eq. (7) occurs in the Wannier representation:
the matrix elements of $V_p$ between Wannier functions of different d subbands
are usually different from zero; the Wannier functions of one subband
are not well localized on an atomic cell, so that important matrix elements of
$V_p$ can involve Wannier functions centered on other sites than the impurity
one, even when $V_p$ is itself well localized on the impurity site (Gautier and
Lenglart, 1965).


D. Main results in the $\{i\lambda\}$ representation

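From the knowledge of $T^{(+)}(E)$, Eq. (10), we can formally deduce the wave
functions and the displaced charge. The corresponding changes in density of
states will be discussed in full.

1. Wave Functions. The perturbed function $|\psi^{(n)(+)}(\mathbf{k})\rangle$ related by Eq. (1)
to the Bloch function $|\varphi^{(n)}(\mathbf{k})\rangle$ of the nth subband with wave vector $\mathbf{k}$ is
given by:

$$\langle i\lambda|\psi^{(n)(+)}(\mathbf{k})\rangle = \sum_j \langle j0|\varphi^{(n)}(\mathbf{k})\rangle\left[\delta_{ij}\,e^{i\mathbf{k}\cdot\mathbf{R}_\lambda} + \frac{V(\Gamma_j)\,G^{(+)}_{ij}(E, \lambda, 0)}{1 - V(\Gamma_j)\,G^{(+)}_{jj}(E, 0, 0)}\right]. \tag{11}$$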


Note that there is a mixing of the various i components of the wave functions 
of the pure matrix: this is because each Bloch function is itself a mixing of 
orbitals of various symmetries. It will be seen below that the scattering 
amplitude in Eq. (11) is related to the phase shifts. 

2. Displaced Charge Z(E). From (3), the charge Z(E) is given by

$$Z(E) = \frac{1}{\pi}\sum_\Gamma p\,\delta(E, \Gamma), \tag{12}$$

where p is the dimension of the irreducible representation $\Gamma$. Thus, for cubic
lattices, p = 2 for $\Gamma_{12}$ and 3 for $\Gamma'_{25}$. From Eqs. (4) and (9), we then have

$$\delta(E, \Gamma_i) = \tan^{-1}\left[\frac{-\pi V(\Gamma_i)\,n(E, \Gamma_i)}{1 - V(\Gamma_i)\,F(E, \Gamma_i)}\right], \tag{13}$$

where $-\pi n(E, \Gamma_i)$ and $F(E, \Gamma_i)$ are respectively the imaginary and real parts
of $G^{(+)}_{ii}(E, 0, 0)$.
From definition (8), $n(E, \Gamma_i)$ is the density of occupation of one of the $|i\rangle$
orbitals of the $\Gamma_i$ class at energy E and per unit of energy, spin, and atomic
volume

$$n(E, \Gamma_i) = \langle i|\delta(E - \mathcal{H}_0)|i\rangle. \tag{14}$$

It is given, in the Bloch representation, by

$$n(E, \Gamma_i) = \frac{\Omega}{(2\pi)^3}\sum_n \int_{S_n(E)} \frac{dS_n}{|\nabla_{\mathbf{k}} E_n|}\,|\langle i|\varphi^{(n)}(\mathbf{k})\rangle|^2, \tag{15}$$

where $\Omega$ is the atomic volume, and the integration is taken over the surfaces $S_n(E)$
of constant energy $E = E_n(\mathbf{k})$. The orthonormalization of the Bloch functions
$\varphi^{(n)}(\mathbf{k})$ leads to the sum rules

$$\int_{-\infty}^{\infty} n(E, \Gamma_i)\,dE = 1, \tag{16}$$

$$\sum_\Gamma p\,n(E, \Gamma) = n(E), \tag{17}$$

where n(E) is the ordinary density of states per unit energy, spin, and atomic
volume of the unperturbed matrix.

From Eq. (8), the function $F(E, \Gamma_i)$ is similarly the Hilbert transform of
$n(E, \Gamma_i)$:

$$F(E, \Gamma_i) = P\int \frac{n(E', \Gamma_i)\,dE'}{E - E'}, \tag{18}$$

where P is the Cauchy principal part.
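
The chain from a partial density of states to its Hilbert transform and then to the phase shift $\delta = \tan^{-1}[-\pi V n/(1 - VF)]$ can be followed numerically. The sketch below is a minimal illustration, not the authors' calculation: it assumes a semicircular model density of states of half-width W = 1 normalized to unity, evaluates the principal-value integral by subtracting the singularity, and uses `atan2` to pick a continuous branch of the inverse tangent (a choice made here). All function names are illustrative.

```python
import math


def semicircle_dos(e: float, half_width: float = 1.0) -> float:
    """Model n(E): semicircle on [-W, W], integral equal to 1 (assumption)."""
    if abs(e) >= half_width:
        return 0.0
    return 2.0 / (math.pi * half_width ** 2) * math.sqrt(half_width ** 2 - e ** 2)


def hilbert_transform(dos, e: float, lo: float, hi: float,
                      steps: int = 20000) -> float:
    """F(E) = P int_lo^hi dos(E') / (E - E') dE' by the midpoint rule.
    The singularity is removed by subtracting dos(E); the subtracted part
    is added back analytically as dos(E) * ln((E - lo) / (hi - E))."""
    h = (hi - lo) / steps
    n_e = dos(e)
    total = 0.0
    for k in range(steps):
        ep = lo + (k + 0.5) * h
        if abs(e - ep) < 1e-12:
            continue  # skip a grid point that coincides with E
        total += (dos(ep) - n_e) / (e - ep)
    total *= h
    if lo < e < hi:
        total += n_e * math.log((e - lo) / (hi - e))
    return total


def phase_shift(v: float, n_e: float, f_e: float) -> float:
    """delta with tan(delta) = -pi V n / (1 - V F), continuous branch."""
    return math.atan2(-math.pi * v * n_e, 1.0 - v * f_e)


def displaced_charge(p: int, delta: float) -> float:
    """Contribution p * delta / pi of one symmetry class to Z(E)."""
    return p * delta / math.pi


e = 0.5
f = hilbert_transform(semicircle_dos, e, -1.0, 1.0)
d = phase_shift(0.2, semicircle_dos(e), f)
print(f, d, displaced_charge(2, d))
```

For the semicircle the transform inside the band is known in closed form, $F(E) = 2E/W^2$, which gives a convenient check on the quadrature.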

We can note from Eqs. (12) and (15) that completely filled or empty subbands
do not contribute directly to the total displaced charge, i.e., to the
screening, although their presence affects the contribution of the partially filled
subbands through their contribution to $F(E, \Gamma)$. They also contribute
directly to the spatial variation of the displaced charge through their action
on the wave functions [Eq. (11)] and eventually through the occurrence of
bound states extracted from them.


III. Change in the Density of States due to Alloying 

A. General considerations 

This important aspect of the electronic perturbation due to impurities can 
be fairly directly related to changes in measurable properties (e.g., electronic 
specific heat, paramagnetism, Knight shift). These experimental results have 
been extensively discussed using either a “rigid band model” which assumes 
that the extra electrons of the impurities fill the d band of the matrix without 
otherwise changing its form (Mott and Jones, 1936), or the concept of
“virtual bound state,” which assumes that the d states of the impurity mix
poorly with the d band of the matrix (Clogston, 1962).

In this chapter, we want to discuss the conditions of validity of the rigid 
band model and the nature of the strong deviations from that model that 
can be due to resonant effects, leading eventually to the formation of virtual 
bound states. We shall show how the use of the phase shift concept helps in 
obtaining fairly general results without having to work out numerical results 
on a specific model. 

In the low-concentration limit, the change in the density of states due to
alloying is directly related to the energy variation of the displaced charge Z(E),
thus of the phase shifts $\delta(E, \Gamma)$. Equation (5) gives

$$\delta n(E) = \sum_\Gamma \delta n(E, \Gamma) = \frac{c}{\pi}\sum_\Gamma p\,\frac{d\delta(E, \Gamma)}{dE}. \tag{19}$$

The contributions $\delta n(E, \Gamma)$ of the various symmetry classes $\Gamma$ are therefore
additive, and can be discussed separately, using Eq. (13) for computing the
phase shifts $\delta(E, \Gamma)$.

We can assume $n(E, \Gamma)$ and $dn(E, \Gamma)/dE$ to be continuous in E nearly
everywhere in the band, from which we obtain the following properties,
which will be useful in the discussion:

P1. $F(E, \Gamma)$ and $dF(E, \Gamma)/dE$ are continuous in the energy ranges of continuity
of $n(E, \Gamma)$ and $dn(E, \Gamma)/dE$, respectively.

P2. When $n(E, \Gamma)$ has a first type of discontinuity for $E = E_1$, i.e.,
$n(E_1 + 0) \neq n(E_1 - 0)$, $F(E, \Gamma)$ has a logarithmic infinity for $E = E_1$. The same
applies for $dn/dE$ and $dF/dE$.

P3. For $E \geq E_t$, energy at the top of the band, $F(E, \Gamma)$ is positive and
decreases with increasing energy E; in the same way, for $E < E_b$, energy at
the bottom of the band, $F(E, \Gamma)$ is negative and decreases with increasing
energy. Also, from P1 there is necessarily an energy $E_0$ for which $F(E_0, \Gamma) = 0$.
Finally $F(E, \Gamma)$ has generally a relative maximum of amplitude near the
band edges.

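P4. When the density of states $n(E, \Gamma)$ is zero at a band edge, we have for
that energy ($E = E_b$ or $E_t$)

$$\frac{d\delta(E, \Gamma)}{dE} = -\frac{\pi V(\Gamma)}{1 - V(\Gamma)\,F(E, \Gamma)}\,\frac{dn(E, \Gamma)}{dE}. \tag{20}$$

Finally, for a solution $E_i$ of the equation

$$1 - V(\Gamma)\,F(E, \Gamma) = 0 \tag{21}$$

we have

$$\frac{d\delta(E_i, \Gamma)}{dE} = -\frac{1}{\pi}\,\frac{dF(E_i, \Gamma)/dE}{n(E_i, \Gamma)}. \tag{22}$$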
P5. Expanding Eq. (15) in powers of $E_t - E$ gives

$$n(E, \Gamma) = \sum_n |\langle i|\varphi^{(n)}(\mathbf{k}_{tn})\rangle|^2\,n_n(E) + \frac{\Omega}{2(2\pi)^3}\sum_n \int_{S_n(E)} \frac{dS_n}{|\nabla_{\mathbf{k}} E_n|}\,(\mathbf{k} - \mathbf{k}_{tn})\cdot\nabla_{\mathbf{k}}\nabla_{\mathbf{k}}|\langle i|\varphi^{(n)}(\mathbf{k})\rangle|^2\cdot(\mathbf{k} - \mathbf{k}_{tn}). \tag{23}$$

The vectors $\mathbf{k}_{tn}$ are at the top of the band structure, and the centrosymmetry
of the surfaces $S_n$ of constant energy has been used. From a theorem by
Van Hove (1953), the first term is of order $(E_t - E)^0$ or $(E_t - E)^{1/2}$, the second
of order $(E_t - E)^{3/2}$. The behavior of $n(E, \Gamma)$ near a band edge therefore
depends on whether the integrals $\langle i|\varphi^{(n)}(\mathbf{k}_{tn})\rangle$ corresponding to all the atomic
orbitals $|i\rangle$ belonging to $\Gamma$ vanish or not.

For instance, in fcc nickel or cobalt, the top of the d band occurs at the
X points in k-space, and the wave functions at the top of the band have only
$\Gamma'_{25}$ components (Fletcher, 1952). In bcc iron, the top of the band seems to
be at the H points, and only $\Gamma_{12}$ components occur (Abate and Asdente,
1965).

B. Conditions of validity of the rigid band model

To first order in the perturbation $V(\Gamma)$, Eq. (13) gives

$$\delta(E, \Gamma) = -\pi V(\Gamma)\,n(E, \Gamma) + O(V^2). \tag{24}$$

The contribution of the $\Gamma$ class to the change in density of states is thus:

$$\delta n(E, \Gamma) = -c\,p\,V(\Gamma)\,\frac{dn(E, \Gamma)}{dE} + O(V^2). \tag{25}$$

The perturbed density of states $n(E, c)$ for a concentration c of impurity can
thus be written

$$n(E, c) = \sum_\Gamma p\,n(E - cV(\Gamma), \Gamma). \tag{26}$$

Thus, to first order in $V(\Gamma)$, the contribution of the $\Gamma$ class to the density of
states is rigidly shifted by an amount $cV(\Gamma)$, without change of form.
If the $V(\Gamma)$ are equal for all the classes, the whole density of states n(E) is
shifted by $cV(\Gamma)$ without change in form.*

The rigid band model is thus valid in the Born (first order) approximation,
but only if crystalline field effects are neglected. In cubic lattices, the two
$V(\Gamma)$’s would strictly be equal if $V_p$ had spherical symmetry. Their difference
is usually small compared with the bandwidth, and this explains the fair
success of the rigid band model for substitutional impurities with small Z’s,
thus near to the matrix in the periodic table.

* The same result can of course be deduced from the change of energy, to first order
in V, of a Bloch state:

$$\delta E^{(n)}(\mathbf{k}) = \langle\varphi^{(n)}(\mathbf{k})|V_p|\varphi^{(n)}(\mathbf{k})\rangle = \sum_i V(\Gamma_i)\,|\langle\varphi^{(n)}(\mathbf{k})|i\rangle|^2$$

is independent of n and k if all the $V(\Gamma)$ are equal.

We wish now to discuss possible strong deviations from the Born approximation
required by the rigid band model. These will occur whenever either
of the two following conditions is not fulfilled:

$$|V(\Gamma)\,F(E, \Gamma)| \ll 1 \tag{27}$$

or

$$|V(\Gamma)\,n(E, \Gamma)| \ll 1. \tag{28}$$

Deviations are, of course, expected for large perturbations $V(\Gamma)$; for a given
perturbation, they are expected mostly in the following zones of energy:
(a) outside the band (bound states), (b) within the band, near the band edges
[see P3 and (27)], (c) near peaks of the density of states: $n(E, \Gamma)$ or
$F(E, \Gamma)$ take then large values [cf. (27) and (28)].


C. The occurrence of bound states 

When $V(\Gamma) \geq [F(E_t, \Gamma)]^{-1}$, the scattering amplitudes of Eq. (11) are infinite
for the energy $E_s > E_t$ of a bound state, solution of

$$F(E_s, \Gamma) = 1/V(\Gamma). \tag{29}$$

From the monotonic variation of $F(E, \Gamma)$ outside the band, there is only one
(possibly degenerate) solution for each class $\Gamma$ at a given value of $V(\Gamma)$.
Thus in the cubic lattices, two degenerate $\Gamma_{12}$ and three degenerate $\Gamma'_{25}$ bound
states are separately extracted from the d band. Similar conclusions would
apply for a strong enough attractive perturbation and bound states below the
band. The bound state transforming like the orbital $|i\rangle$ has an amplitude on
the central cell which is easily obtained:

$$|\langle i0|\psi_i(E_s)\rangle|^2 = -\frac{F(E_s, \Gamma)^2}{dF(E_s, \Gamma)/dE}. \tag{30}$$

We can note that the extension of the bound states, when just extracted
from a band, is very different depending on the way $n(E, \Gamma)$ varies near
the band edge.

Thus, when $n(E, \Gamma) = \alpha(E_t - E)^{1/2}$, the bound state has an infinite extension
when it emerges from the band. When $n(E, \Gamma) = \alpha(E_t - E)^{3/2}$, it emerges with
a finite extension.
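
The bound-state condition $F(E_s, \Gamma) = 1/V(\Gamma)$ can be solved explicitly for a model band. The sketch below assumes a semicircular density of states of half-width W normalized to unity, for which $F(E) = (2/W^2)[E - (E^2 - W^2)^{1/2}]$ above the band; this model, the bisection bracket `e_max`, and the function names are illustrative assumptions, not the authors' band structure. The central-cell weight is evaluated as $-F^2/(dF/dE)$ at $E_s$, with the derivative taken numerically.

```python
import math


def f_outside(e: float, half_width: float = 1.0) -> float:
    """F(E) above the top of a semicircular band (closed form, E >= W)."""
    return 2.0 / half_width ** 2 * (e - math.sqrt(e * e - half_width ** 2))


def bound_state_energy(v: float, half_width: float = 1.0, e_max: float = 1e3):
    """Solve F(E_s) = 1/V by bisection above the band top.
    Returns None when V < 1/F(E_t), i.e. no bound state is extracted.
    Assumes V is small enough that E_s < e_max."""
    e_top = half_width
    if v * f_outside(e_top, half_width) < 1.0:
        return None
    lo, hi = e_top, e_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # F decreases monotonically outside the band, so V*F > 1 means
        # the root lies at higher energy.
        if v * f_outside(mid, half_width) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def central_cell_weight(e_s: float, half_width: float = 1.0,
                        h: float = 1e-6) -> float:
    """-F^2 / (dF/dE) at E_s, with a central-difference derivative."""
    d_f = (f_outside(e_s + h, half_width) - f_outside(e_s - h, half_width)) / (2 * h)
    return -f_outside(e_s, half_width) ** 2 / d_f


print(bound_state_energy(0.4))  # below threshold V = W/2: no bound state
e_s = bound_state_energy(1.0)
print(e_s, central_cell_weight(e_s))
```

This model density of states vanishes as $(E_t - E)^{1/2}$ at the band edge, so the central-cell weight tends to zero as V approaches the threshold, consistent with the infinite extension noted above for that case.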


D. The occurrence of virtual bound states near a band edge 

As for free particles, a virtual bound state will be defined as a positive peak
produced in the density of states by the perturbation. To be a proper virtual
bound state, this peak must be narrow compared with the features of the
density of states of the matrix; its area must also correspond, at least roughly,
to the number of bound states that could occur in the same class. In other
words, $\delta(E, \Gamma)$ must vary by an amount near $\pi$ over a narrow range of energy.

The possible occurrence of such virtual bound states of a given class near
a band edge is shown to depend on the behavior of the corresponding density
of states $n(E, \Gamma)$ near that edge. We shall show that a repulsive perturbation
can only produce a virtual bound state of a given class near the top of the band
if the wave functions at the top of the band do not contain atomic orbitals of
the same class. A similar result can be obtained for attractive perturbations
and the bottom of the band.

We examine the deviations from the rigid band model produced by a
repulsive potential $V_p$ near the top $E_t$ of the d band. $V_p$ is thus too strong for
conditions (27) or (28) to apply; but we assume it is not strong enough to
extract a bound state. Thus, from (29), $V(\Gamma) < F(E_t, \Gamma)^{-1}$. From P2, this
requires $n(E_t - 0, \Gamma) = 0$: virtual bound states of the $\Gamma$ class cannot occur
near an edge with a finite corresponding density of states. From P5, the only
cases to discuss are therefore those where $n(E, \Gamma) = \alpha(E_t - E)^p$, with p = 1/2 or 3/2.
$F(E, \Gamma)$ is then finite and continuous at $E = E_t$. If it has a negative slope just
below $E_t$, Eq. (21) has at least two solutions $E_i$ (i = 1 and 2) within the band:
$E_1 \leq E_2 < E_t$. The scattering cross section which appears in Eq. (11) has
a finite maximum for $E = E_2$. We shall show however that this “resonance”
condition is not sufficient to produce proper virtual bound states, contrary
to what has been sometimes stated (Wolff, 1961; Clogston, 1962).

(a) $n(E, \Gamma) = \alpha(E_t - E)^{1/2}$ (i.e., the Bloch states at the band edge have $\Gamma$
components). The phase shift $\delta(E, \Gamma)$ decreases from 0 at the band edge with
decreasing energies, with an initial infinite slope [P4, cf. Fig. 1].

When $n(E, \Gamma)$ is less than its parabolic development over all the d band,
it can easily be shown that $F(E, \Gamma)$ has a positive slope just below $E_t$ (Gautier
and Gomes, 1965). $|\delta(E, \Gamma)|$ is then always less than $\pi/2$ [Fig. 1(a)], and no
resonance occurs. This general case is similar to the behavior of the “s” phase
shift in the scattering of particles by a spherical potential.

If n(E, Γ) is greater than its parabolic development, F(E, Γ) can have a
negative slope just below E_t [Fig. 1(b)]; in that case, it necessarily has a
maximum within the bandwidth. Equation (21) then has at least two solutions
E_1 and E_2, between which |δ(E, Γ)| > π/2. However, the change in density
of states due to the impurity, proportional to dδ/dE, decreases continuously
from +∞ to negative values when the energy E decreases from E_t to E_1; there
is usually no maximum of dδ/dE in this range of energy. Figure 2 shows, as an
example, the various functions of interest for a localized potential acting on
a nondegenerate band in the simple cubic structure. In fact a strong peak
in the density of states near E_t is generally required to produce such a
maximum. The corresponding resonant state will be studied below (Section
III, E); it is not a proper virtual bound state.

Fig. 1. (a) F(E), δ(E), and ∂n(E, 0)/∂c for n(E) less than its parabolic
development (schematic). (b) F(E), δ(E), and ∂n(E, 0)/∂c for n(E) greater than
its parabolic development (schematic).

In conclusion, whatever the form of F(E, Γ) near the top of the band, no
virtual bound state can be produced if n(E, Γ) = a(E_t - E)^{1/2}.

(b) n(E, Γ) = a(E_t - E)^{3/2} (i.e., the Bloch states at the band edge have no
Γ component). From the continuity of dn(E, Γ)/dE at the band edge and from
P2, one deduces that dF(E, Γ)/dE is continuous at the band edge, and therefore
negative. Also, from P4, the phase shift has a vanishing slope at the
band edge. It is then easily shown that a virtual bound state necessarily occurs
near the band edge when the perturbation is not quite strong enough to
produce the corresponding bound state. For, when V(Γ) → [F(E_t, Γ)]^{-1}, the
phase shift |δ(E, Γ)|, which starts from E_t with zero slope, crosses the value
π/2 and reaches values near to π in an energy range that tends toward E_t
(Fig. 3). The variation is similar to that of the "p" phase shift for free
electrons.

Fig. 2. Phase shift curves for a nondegenerate band in the simple cubic structure.

The virtual bound state produced has the following characteristics:

(i) The top of the peak in the density of states occurs at an energy E_c defined
by d²δ/dE² = 0. It is usually distinct from the resonant energy E_2 such that
δ = π/2; but both energies tend towards E_t when V(Γ) → [F(E_t, Γ)]^{-1}.

(ii) Its width is of the order of -π/(dδ/dE_2) = -πn/(dF/dE_2), and it
decreases as (E_t - E_2)^{3/2} when the state tends toward the edge.

(iii) The number of supplementary states per impurity atom, deduced from
the area of the peak, tends toward the number of atomic states of the Γ class
when E_2 → E_t.

(iv) These states are subtracted from an energy region centered around the
other solution E_1 of Eq. (21). But they are usually subtracted from a large part
of the band, so that there is no marked antiresonant peak in the density
of states.










Fig. 3. F(E), δ(E), and ∂n(E, 0)/∂c for a band such that dn(E)/dE = 0 (schematic).

E. Resonant and antiresonant effects near to a peak of density of states

We shall now analyze briefly the possible effects of features of the density
of states that occur within the band, away from the edges. We shall distinguish
the possibilities of strong regular peaks of density and of Van Hove
anomalies. We shall show that the resonant effects possible in these cases are
not proper virtual bound states.

Let us assume that a high and sharp peak p, of density n_p(E, Γ), is
superimposed on a regular density of states n_0(E, Γ).

For very small V(Γ)'s, when the rigid band model applies, the peak p
produces in the same energy range a peak in the change of density of states
due to alloying. The total number of states per impurity atom is proportional
to V(Γ) and to the total number of states in peak p.

We next consider V(Γ) large enough for the condition (b) |V(Γ)F_p(E, Γ)| > 1
to apply near the energy E_p^t at the top of the peak p, but with still
(a) V(Γ)n_0(E, Γ) and V(Γ)|F_0(E, Γ)| ≪ 1 over the whole range of energy of the
band (E_b < E < E_t).

The corresponding variation of δ(E, Γ) is shown in Fig. 4. From (a), it has
very small values except near the peak. From (b), condition (21) is fulfilled
for an energy E_R > E_p^t as well as for an energy E_AR which falls within the
peak. For decreasing energies, the phase shift δ(E, Γ) thus decreases sharply
from small values to values near to -π around E_R, then increases sharply
from near to -π to near to zero around E_AR. For energies below the peak,
δ(E, Γ) is lower than the value δ_0(E, Γ) deduced from the action of V(Γ) on
n_0(E, Γ) only.

Fig. 4. Effect of a sharp peak on δ(E) and ∂n(E, 0)/∂c (schematic).

The change in density of states due to the impurities has thus the following
characteristic features in this special case (Fig. 4):

A sharp antiresonant peak within the peak p of the matrix. The width
Δ_AR = E_p^t - E_AR of this peak is less than the width of the peak p, and it
lies on the high energy side of it for repulsive perturbations. The number of
states per impurity atom contained in this antiresonant peak is nearly equal to
the degeneracy of the Γ class in the limit of a large and narrow peak.

A resonant peak in the energy range above the peak (E_p < E < E_p + Δ_R).
Its width Δ_R is much larger than that of the peak p. It can be shown (Gautier
and Gomes, 1965) that Δ_R increases with the number of states contained in
peak p, so that the resonance peak is well marked only if this is a small
fraction of the total number of Γ states in the d band. The number of states
per impurity atom contained in the resonant peak is only near to the
degeneracy of the Γ class if the phase shift δ_0(E, Γ) due to the broad part of
the band is very small. This requires that n_0(E) be small, in which case, as
we have seen, the resonant peak is broad.

An increase in density below E_AR, with a large width, of order Δ_R. The
corresponding number of states must of course be just equal to the difference
between the antiresonant and the resonant peaks.

These results are preserved qualitatively for still larger perturbations, such
that V(Γ)|F_0(E, Γ)| > 1, but with a smaller resonant peak and a more complex
variation of the phase shift.

It can be said in a qualitative way that the resonant state described in this 
section is due to the d states of the matrix originally in the peak p and shifted 
in energy by the perturbation; in the same way, the antiresonant state 
describes the subtraction of those states from peak p. We have seen however 
that the resonant state is only well defined when its area is small. It should 
therefore not be confused with a proper virtual bound state. 

F. Van Hove singularities 

Van Hove singularities (Van Hove, 1953) are expected in all bands, and in
particular in the d band of transitional metals. Indeed most of the peaks in
their density of states are probably nonregular and correspond to such
singularities (cf. Wohlfarth and Cornwell, 1961).

At the energy E_s of such a singularity, the break in dn(E, Γ)/dE produces
an infinity in dF(E, Γ)/dE (cf. P2). The phase shift δ(E, Γ) thus has an
infinite slope; the change in density due to alloying has a very sharp peak
with an infinite height.

Two points should be stressed. First, these phase shift singularities are
localized in a narrow range of energy [Figs. 5(a) and 5(b)]. As an example, the
singularities related to one nondegenerate band in the simple cubic structure
are shown in Fig. 2. Second, the infinity in the change n(E, c) - n(E, 0) of
density of states has no physical reality. It only means that the density of
states n(E, c) of the alloy varies more slowly than the concentration c near
the Van Hove singularity E = E_s. A similar effect has been discussed by
Lifshitz (1964).


Fig. 5(a) and (b). Effects of a Van Hove singularity (schematic).


IV. Extensions

It is now of some interest to discuss in what directions the preceding type
of phase shift analysis can be extended. We develop the case of extended
perturbations (for narrow bands) and of broad bands, then list some other
possible extensions.

A. Extended perturbations 

We now assume that the perturbation potential V p is no longer localized 
on an atomic site; we still assume, though, that it retains all the symmetries 
of the lattice point group. This type of extension is of interest in discussing 
substitutional impurities far from the matrix in the periodic table, such as 
Cr, V, Ti or Zn, Al, Si in nickel, where the perturbation is known to affect 
strongly the neighboring atoms in the matrix. 

The use of the {l, λ} representation defined in Section II,C would then lead
to a more complex set of equations. But we can construct from it a new
representation {Γμ}, which takes account of the extension and of the
symmetries of the perturbation.*

Let us denote by Λ the set of λ sites at an equal distance to the central site.
With the corresponding atomic functions |lλ>, we can build linear combinations
which transform according to various symmetries of the lattice point
group. The complete set of these combinations, which we shall denote {ΓΛ},
forms a basis for the representations Γ of the lattice point group. However
these representations are reducible; some of the combinations corresponding
to different distances can have the same symmetry properties. We must then
form linear combinations of these new functions with different distances to
obtain a possible basis for all the irreducible representations Γ of the lattice
point group. This set will be denoted {Γμ}. Each linear combination obviously
contains only atomic functions with a given symmetry: for d bands in the
cubic lattices, d functions of the Γ_12 or Γ_25' class.

The advantages of this new representation are twofold: 

(a) The scattering operator T^(+)(E) induces transitions only between
atomic functions on perturbed sites. Thus only representations Γ with basis
functions containing atomic functions on perturbed sites will be affected.
The number of these representations is at most that of the lattice point group;
it can be much lower if only a small number of neighbors is perturbed by
the impurity. The corresponding number of basis functions is at most equal
to the number of perturbed sites; it can be lower, if linear combinations at
different distances Λ have the same symmetry properties.

(b) The scattering operator is invariant under the operations of the lattice
point group. It therefore is diagonal in the {Γμ} representation. For
the representations Γ where {ΓΛ} is irreducible, T^(+)(E) thus is diagonal in
{ΓΛ}; for the representations Γ where {Γμ} is built up by linear combinations
of |ΓΛ> functions with p_Γ different perturbed distances Λ, the matrix elements
of T^(+)(E) in {ΓΛ} are given by a set of difference equations of order p_Γ. For
perturbations with fairly close range, these numbers p_Γ are very small, and the
computation of the scattering operator is thus very simple (cf. Gautier and
Lenglart, 1965).

* The representation used here was introduced by Gautier and Lenglart (1965). It is
either simpler or more general than those used by Olszewski (1962), Kanamori (1965),
Callaway (1964), and Clogston (1964).

From Eq. (4), the phase shift operator has the same properties as the
scattering operator. The change in the number of states of energy less than
E, due to an impurity, can then be written from Eq. (3) as

\[ Z(E) = \frac{1}{\pi} \sum_{\Gamma} n_{\Gamma}\, \delta(E, \Gamma). \tag{31} \]

Here n_Γ is the dimension of the irreducible representation Γ and use has been
made of the fact that the various |Γμ> functions of the same representation Γ
and with the same energy E must, by symmetry, have the same phase shift*
δ(E, Γ) = <Γμ|δ|Γμ>. Each irreducible representation Γ thus contributes
separately to the change in density of states dZ/dE, and the discussion on
the change in density due to alloying can proceed in much the same way as
for localized potentials. Two complications are introduced however: (a) The
impurity problem now depends on several parameters, corresponding to the
action of the perturbation potential on the various types of neighbors to the
impurity atom. The screening condition Z(E_F) = Z is no longer sufficient to
fix these parameters. (b) The relation between phase shift and perturbation
potential can involve, for some representations Γ, the solution of a (simple)
difference equation.
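
The phase shift sum rule (31) is simple enough to evaluate directly. The sketch below is only an illustration: the function name and the numerical phase shifts are hypothetical, and the two classes used in the example are the two d-type representations of the cubic point group (dimensions 2 and 3).

```python
import math

def friedel_sum(phase_shifts, dimensions):
    """Evaluate Z(E) = (1/pi) * sum over classes of n_class * delta(E, class).

    phase_shifts: phase shift (radians) of each irreducible representation
    dimensions:   dimension n of the same representations, in the same order
    """
    if len(phase_shifts) != len(dimensions):
        raise ValueError("one dimension is needed per phase shift")
    return sum(n * d for n, d in zip(dimensions, phase_shifts)) / math.pi

# Hypothetical example: both d classes of a cubic lattice (dimensions 2 and 3)
# carry a phase shift of -pi/2 at the energy considered.
z = friedel_sum([-math.pi / 2, -math.pi / 2], [2, 3])
```

With these illustrative numbers the five d states contribute half a state each, so that 2.5 states are displaced in total.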

B. Broad bands. The generalized Wannier functions 

Such an extension would be of great interest to treat the s band of normal
metals or semimetals, or indeed of transitional ones, or to compute transport
effects in covalent or ionic insulators.

It is known (Des Cloizeaux, 1964a) that, in many cases, especially when the 
lattice has a center of inversion, it is possible to define generalized Wannier 
functions which (a) transform according to the various representations of the 
lattice point group, (b) decrease exponentially with distance. The preceding 
method can then be formally extended, replacing the atomic functions by 
these generalized Wannier functions. All the general results described in the 
preceding section on extended perturbations hold exactly, including the phase 
shift formula (31). 

* The present formula (31) is identical to that obtained for free electrons scattered by a
central potential well (Friedel, 1954). In the lattice case, the finite number of phase shifts is
related to the finite order of the lattice point group; in the case of spherical symmetry, the
infinite number of phase shifts is related to the infinite order of the rotation group.

For actual computations, this general method is probably not yet practical,
because these Wannier functions are not known with enough accuracy. Although
a general theorem concerning their spatial extension is lacking, it can
be expected that, in broad bands, they extend too much over several sites
for the simple approximation of localized perturbation to hold: the action
of the perturbing potential on the Wannier functions of neighboring sites is
probably never negligible in broad bands.

C. Other possible extensions 

The phase shift method can of course be extended in various ways. For
studies of electronic structures, other cases of interest for narrow bands would
include bands with other than d symmetry. The method can be similarly
applied to the perturbation by impurities of the phonon or the spin wave
spectra. For electrons, phonons, or spin waves, cases of defects of lower
symmetry could also be considered: substitutional impurities in lattices with
more than one atom per cell, interstitial impurities, pairs of impurities, etc.,
and linear or planar defects (dislocations, free surfaces, stacking faults, twin
boundaries, ...).

The analysis should then be made in terms of the {Γμ} representation
corresponding to the irreducible representations Γ of the perturbed lattice.
The choice of linear combinations of atomic functions |Γμ> is then less clear,
except in cases of high symmetry, such as the interstitial impurities, the
substitutional impurities in the NaCl or diamond structures, a free surface
parallel to a close packed plane, etc. It is clear that, the lower the symmetry
of the defect, the less interesting the phase shift method becomes.


I. Introduction

The band theory of electrons in solids, though usually highly successful,
becomes invalid in certain cases. There are two possibilities for this
breakdown. The one first indicated by Mott (1949) refers to a simple crystal of
one-electron centers (hydrogen atoms) which according to the band model has
a half-filled band and is thus a metallic conductor. When the lattice distance
is very large, however, an electron bound to one center overlaps very little
with its neighbors. In the ground state each center should thus have an
electron localized near it. A finite energy is then required to create a pair of
free carriers; the model thus behaves as a semiconductor. The second
possibility arises in cases of strong interaction of a single electron (in a nonmetal)
with the ions of the lattice (cf. Fröhlich et al., 1963, where earlier literature is
given). In some cases, this would lead to extremely narrow bands; if the energy
broadening due to the interaction of an electron with the thermal vibrations
of the lattice becomes larger than this bandwidth, the band model is no longer
useful. Conduction then takes place by a jumping mechanism. From the
uncertainty relation (Fröhlich and Sewell, 1959) it can be estimated that this
situation is likely to occur when mobilities less than 0.1 cm²/V·sec are found.

The two breakdowns of the band model are probably not unconnected.
Thus many oxides, such as V₂O₃, which according to the band model should
be metals, behave as semiconductors at low temperatures and, moreover, have
very low mobility.

Mott has also predicted that, using the lattice distance as a parameter, a
phase transition from the nonmetallic (large lattice distances) into the metallic
state (small lattice distances, band model) should occur at a critical lattice
distance. V₂O₃ and other oxides show a phase transition from a
low-temperature nonmetallic state to a high-temperature metallic state which is
probably not unconnected with Mott's prediction (cf. Mott, 1961).

Mott is largely concerned with the properties of the system in the immediate
neighborhood of the ground state. It is quite feasible, however, that for given
lattice distance the ground state of a system is nonmetallic but that it is
metallic in higher excited states.

II. A Simple Model 

It is instructive in this connection to illustrate the breakdown of the band
model (first case) in terms of the number of levels in a narrow continuum
above the ground state (Fröhlich, 1955). If the band model holds, then a total
number of N electrons and N centers (N hydrogen atoms) provides a band
with 2N one-electron states (two spin directions). For large lattice distances
this band is very narrow. It hence leads to a narrow continuum of the whole
system containing

\[ Z_b = \binom{2N}{N} = \frac{(2N)!}{(N!)^2} \tag{1} \]

levels.

If, on the other hand, the ground state is described by electrons localized
on the centers (one electron per center) then the spin degeneracy leads to a
narrow continuum containing only

\[ Z_l = 2^N \tag{2} \]

levels, i.e., a number which for large N is negligible compared with Z_b.

If in this localized model we would, however, permit each center to be
occupied by up to two electrons (opposite spin) then the number of states
would be Z_b as in the band model. These states form however a continuum
only if the energy required to move an electron from one center to another
(not a neighbor) vanishes, or is extremely small.
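
The two level counts can be compared numerically. The sketch below assumes that the band count is the number of ways of placing N electrons in 2N one-electron states, and that the localized count is the spin degeneracy 2^N; the function names are illustrative only.

```python
from math import comb, pi, sqrt

def band_levels(n):
    """Number of ways to place n electrons in 2n one-electron band states."""
    return comb(2 * n, n)

def localized_levels(n):
    """Spin degeneracy of n centers each holding exactly one electron."""
    return 2 ** n

def ratio(n):
    """Fraction of the band continuum taken up by the localized levels."""
    return localized_levels(n) / band_levels(n)

def stirling_ratio(n):
    """Large-n estimate of the same ratio, sqrt(pi * n) * 2**(-n)."""
    return sqrt(pi * n) * 2.0 ** (-n)
```

For N = 2 the counts are 6 and 4; for N = 50 the ratio has already fallen to about 1e-14, which illustrates how negligible the localized levels become for a macroscopic N.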

Assume now the ground state to be described in terms of localized electrons,
i.e., that the band model is invalid. Let I_0 denote the minimum energy required
to create a pair of carriers, i.e., to remove one electron to a distant site. This
energy involves long-range polarization effects. Now, if a number of such
carriers has been created, then they in turn will influence the energy I required
to create a further pair of carriers. This influence will express itself through
screening effects such that I is smaller than I_0. Let N_c be the number of
negative (and positive) carriers and let

\[ x = N_c / N. \tag{3} \]

We then assume I to depend on x only,

\[ I = I_0 f(x). \tag{4} \]

A condition for a transition of the system into the metallic state is thus to
assume that f(x) becomes zero for x > x_1, with x_1 < 1, i.e.,

\[ f(x) = 1 \ \text{if}\ x = 0; \qquad f(x) = 0 \ \text{if}\ x_1 < x < 1. \tag{5} \]

The screening effects leading to the decrease of f(x) with x might be
calculated with the help of approximations like the Debye-Hückel formula.
A simple-minded treatment of this nature is of course objectionable.
Nevertheless it seems instructive to show that it may lead to a first-order phase
transition.

For this purpose we determine x from the thermal equilibrium condition
(similar to ionization equilibrium) which at temperature T yields

(6)

Here A(T) increases with T; treatment of the carriers as free would require
A(T) ∝ T³. Thus x must satisfy the equation

(7)

The left-hand side decreases from +∞ at x = 0 to zero at x = x_0 and is
negative for 1 > x > x_0. Here,

(8)

i.e., x_0 ≈ A^{1/2} if A ≪ 1; x_0 ≈ 1 - 1/A if A ≫ 1. Note that A → 0 as T → 0.
Hence, for low temperatures, x_0 < x_1; Eq. (7) will then have one solution x_f
with x_f ≪ 1, the usual excitation of a few carriers, x_f ∝ exp(-I_0/2kT). With
increasing T, however, A and hence x_0 increases and finally x_0 becomes larger
than x_1. At high temperatures, therefore, there are at least two, possibly more,
solutions of Eq. (7), depending on the detailed shape of f(x). The system will
choose the solution with the smallest free energy. We may then expect a
transition from the semiconducting state with x_f ≪ 1 to a state in which x
is close to unity, i.e., a metallic state.
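
The counting of solutions can be illustrated numerically. The sketch below is not the author's calculation. It assumes an equilibrium condition of the ionization type, x²/(1 - x) = A exp(-I_0 f(x)/kT), chosen because it reproduces the limits x_0 ≈ A^{1/2} and x_0 ≈ 1 - 1/A quoted above, and it uses a hypothetical linear screening function f(x) = 1 - x/x_1 for x < x_1. In the text A and I_0/kT both vary with T; here they are treated as independent parameters for simplicity.

```python
import math

def x0_from_a(a):
    """Root in (0, 1) of x0**2 / (1 - x0) = a (assumed form of the x0 relation)."""
    return 0.5 * (math.sqrt(a * a + 4.0 * a) - a)

def f_linear(x, x1):
    """Hypothetical screening function: 1 at x = 0, falling linearly to 0 at x1."""
    return max(0.0, 1.0 - x / x1)

def g(x, a, beta, x1):
    """Assumed equilibrium condition written as g(x) = 0, with beta = I0 / kT."""
    return math.log(a * (1.0 - x) / (x * x)) - beta * f_linear(x, x1)

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection on an interval over which func changes sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def roots(a, beta, x1, n_grid=200_000):
    """All solutions 0 < x < 1, found by a grid scan followed by bisection."""
    func = lambda x: g(x, a, beta, x1)
    found = []
    x_prev = 0.5 / n_grid
    g_prev = func(x_prev)
    for i in range(1, n_grid):
        x_cur = (i + 0.5) / n_grid
        g_cur = func(x_cur)
        if g_prev * g_cur < 0.0:
            found.append(bisect(func, x_prev, x_cur))
        x_prev, g_prev = x_cur, g_cur
    return found

if __name__ == "__main__":
    print(roots(1e-4, 10.0, 0.3))  # small A: a single solution with x << 1
    print(roots(1.0, 10.0, 0.3))   # large A: several solutions, one at x0 > x1
```

With these illustrative parameters the small-A case has one solution close to A^{1/2} exp(-I_0/2kT), while the large-A case has additional solutions, the largest of which lies at x_0 above x_1. Which solution is realized would still have to be decided by comparing free energies, which the sketch does not attempt.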


III. General Remarks 

The approximations used in the treatment of phase transitions frequently
break down just near the transition temperature. This also holds for the model
suggested in Section II. Much work has been done on the theory of Coulomb
interaction of electrons in the band model; one might, therefore,
consider taking the band model as a starting point. For the band wave
functions form a complete set of functions and are thus capable of describing
states in which the band model is invalid. Taking the density (or lattice
distance) as a parameter one should then expect for high densities a continuum
of Z_b [cf. Eq. (1)] states above the ground state. At lower densities a number
Z_l [Eq. (2)] of levels should be split off (to lower energies) from this continuum.
Note that Z_l/Z_b ≈ 2^{-N} is negligibly small so that the number of states in the
former continuum is hardly changed. Unfortunately, the work on Coulomb
interaction of electrons in the band model has been justified for the
high-density limit only. New methods should thus be devised to deal with the
low-density case. It would not be surprising to find that the ordering influence
of the Coulomb interaction may have serious repercussions on certain
properties even at densities at which the Z_l ordered states are still within the
continuum of the Z_b band states. This should apply in particular to metals with
incomplete shells. It will be remembered that here the band model is in
difficulties with regard to both superconductivity and ferromagnetism.


Energy Bands in Periodic and Aperiodic Fields

Certain aspects of the thermal and magnetic properties of metals, interpreted
in accordance with the independent electron model, depend rather sensitively
on the density of electronic states in the highest occupied energy band. For
the periodic field of the perfect crystal, mathematical methods have been
developed which lead to a determination of this density-of-states function.
These methods depend primarily on Bloch's theorem which defines the wave
vector in terms of which the individual electronic states are specified. Many
determinations of energy-band structures of pure metals and semiconductors
have now been made, notably by Slater and his colleagues.

In disordered alloys where the potential energy of an electron is an aperiodic 
function of position Bloch’s theorem does not apply and the stationary states 
cannot be specified by points in wave-vector space. The determination of the 
individual eigenvalues, as a first step towards obtaining the density of states, 
is virtually impossible and other methods have to be sought. In recent years 
the properties of Green’s functions of electrons in solids and liquids have been 
widely investigated [see for example Edwards (1962) and Jones (1964)] and 
this analysis has shed some light on the problem of the density of states in 
disordered alloys. 

Let H denote the Hamiltonian of an electron in a space-periodic field which
may include the special case of zero field, and let V(r) denote a potential
energy over the same range which has no periodic properties. The Green
function G_0 corresponding to H is then defined by

\[ (H - \varepsilon - i\eta)\, G_0(\mathbf{r}, \mathbf{r}'; \varepsilon) = -\delta(\mathbf{r} - \mathbf{r}'), \tag{1} \]

and the Green function G of the aperiodic field by

\[ (H + V - \varepsilon - i\eta)\, G(\mathbf{r}, \mathbf{r}'; \varepsilon) = -\delta(\mathbf{r} - \mathbf{r}'), \tag{2} \]

where η is a real constant.

If ψ_k(r) denotes the Bloch wave function of the periodic field and a diagonal
element of G with respect to these functions is denoted by G(k, ε), then it
may be shown (Jones, 1965) that the density of states N(ε) in the aperiodic
field is given by

\[ N(\varepsilon) = -\frac{1}{8\pi^4}\, \operatorname{Im} \int G(\mathbf{k}, \varepsilon)\, d^3k, \tag{3} \]

where Im implies that the imaginary part of the integral is to be taken.

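Two real functions Σ(k, ε) and Γ(k, ε) may be defined by the equation

\[ G(\mathbf{k}, \varepsilon) = (\varepsilon - \varepsilon_{\mathbf{k}} - \Sigma + i\Gamma)^{-1}, \tag{4} \]

where ε_k is the eigenvalue belonging to ψ_k(r). Hence Eq. (3) may be written

\[ N(\varepsilon) = \frac{1}{8\pi^4} \int \frac{\Gamma(\mathbf{k}, \varepsilon)\, d^3k}{\{\varepsilon - \varepsilon_{\mathbf{k}} - \Sigma(\mathbf{k}, \varepsilon)\}^2 + \Gamma^2(\mathbf{k}, \varepsilon)}. \tag{5} \]

This relation is quite general but not, of course, useful until Σ and Γ can be
determined. This is done by making use of the exact operator relation

\[ G = G_0 + G_0 V G, \tag{6} \]

and by using the value of G_0(k, ε) in terms of the eigenvalue ε_k, viz.

\[ G_0(\mathbf{k}, \varepsilon) = (\varepsilon - \varepsilon_{\mathbf{k}} + i\eta)^{-1}. \tag{7} \]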

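The off-diagonal matrix elements of G with respect to the Bloch states ψ_k(r)
are given by

\[ G(\mathbf{l}, \mathbf{k}) = \sum_{\mathbf{m}} G_0(\mathbf{l}, \mathbf{l})\, V_{\mathbf{lm}}\, G(\mathbf{m}, \mathbf{k}), \tag{8} \]

and the diagonal elements by

\[ G(\mathbf{k}, \mathbf{k}) = G_0(\mathbf{k}, \mathbf{k}) + \sum_{\mathbf{l}} G_0(\mathbf{k}, \mathbf{k})\, V_{\mathbf{kl}}\, G(\mathbf{l}, \mathbf{k}). \tag{9} \]

Since (8) applies only for l ≠ k the diagonal element in the sum (9) must be
removed before substituting for G(l, k) from (8). This gives

\[ G(\mathbf{k}, \mathbf{k}) = G_0(\mathbf{k}, \mathbf{k}) + G_0(\mathbf{k}, \mathbf{k})\, V_{\mathbf{kk}}\, G(\mathbf{k}, \mathbf{k}) + \sum_{\mathbf{l} \neq \mathbf{k}} G_0(\mathbf{k}, \mathbf{k})\, V_{\mathbf{kl}} \sum_{\mathbf{m}} G_0(\mathbf{l}, \mathbf{l})\, V_{\mathbf{lm}}\, G(\mathbf{m}, \mathbf{k}). \tag{10} \]

So far no approximation is involved, but from this stage onwards all
off-diagonal elements of G in (10) will be neglected, on the grounds that according
to (8) they would introduce higher than second powers of the off-diagonal
elements of V. Hence to this approximation

\[ G(\mathbf{k}, \mathbf{k}) \Bigl[ 1 - G_0(\mathbf{k}, \mathbf{k})\, V_{\mathbf{kk}} - G_0(\mathbf{k}, \mathbf{k}) \sum_{\mathbf{l} \neq \mathbf{k}} |V_{\mathbf{kl}}|^2\, G_0(\mathbf{l}, \mathbf{l}) \Bigr] = G_0(\mathbf{k}, \mathbf{k}), \tag{11} \]

which may be written

\[ G^{-1}(\mathbf{k}, \mathbf{k}) = G_0^{-1}(\mathbf{k}, \mathbf{k}) - V_{\mathbf{kk}} - \sum_{\mathbf{l} \neq \mathbf{k}} |V_{\mathbf{kl}}|^2\, G_0(\mathbf{l}, \mathbf{l}), \tag{12} \]

and thus by (4) and (7)

\[ V_{\mathbf{kk}} + \lim_{\eta \to 0} \sum_{\mathbf{l} \neq \mathbf{k}} \frac{|V_{\mathbf{kl}}|^2}{\varepsilon - \varepsilon_{\mathbf{l}} + i\eta} = \Sigma(\mathbf{k}, \varepsilon) - i\Gamma(\mathbf{k}, \varepsilon), \tag{13} \]

which is the equation determining Σ and Γ, the functions required to give
N(ε) according to Eq. (5).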
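
The quality of the second-order approximation can be checked on the smallest possible case. The sketch below is purely illustrative: it takes a hypothetical problem with only two unperturbed states, for which the exact diagonal element of the Green function is known in closed form, and compares it with the approximate form of Eq. (12). The energies and matrix elements are arbitrary numbers.

```python
def exact_g11(z, e1, e2, v11, v22, v12):
    """Exact element G(1,1) of (z - H0 - V)^(-1) for a two-state problem."""
    h11, h22 = e1 + v11, e2 + v22
    return (z - h22) / ((z - h11) * (z - h22) - abs(v12) ** 2)

def approx_g11(z, e1, e2, v11, v12):
    """Second-order form: 1/G(1,1) = 1/G0(1,1) - V11 - |V12|^2 * G0(2,2)."""
    g0_22 = 1.0 / (z - e2)
    return 1.0 / ((z - e1) - v11 - abs(v12) ** 2 * g0_22)

# z = energy + i*eta; all numbers below are arbitrary illustrative values.
z = complex(0.3, 0.05)
g_exact = exact_g11(z, 0.0, 1.0, 0.02, 0.03, 0.05)
g_approx = approx_g11(z, 0.0, 1.0, 0.02, 0.05)
relative_error = abs(g_exact - g_approx) / abs(g_exact)
```

In this two-state case the only difference between the two expressions is that the approximate one uses the unperturbed G_0(2,2) in the sum, so the error is of third order in V and vanishes when V_22 = 0.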

Two widely different examples show the generality of (5) and (13). In the
first example, H_0 is taken to be the field-free case of completely free electrons,
and V a periodic potential. The matrix elements V_kl are therefore the Fourier
components of the potential V and vanish unless l = k + K, where K is a
reciprocal lattice vector of the periodic field. Since the denominator of the
sum on the left of (13) does not vanish for any term in which V_{k,k+K} is
finite, the sum is real and Γ = 0.

To evaluate (5) in this case let σ = ε - ε_k - Σ(k, ε), and choose for the
volume element d³k the expression dS dσ/|∇σ|, where dS is an element of
area on the surface σ = const. As Γ tends to zero the expression Γ/π(σ² + Γ²)
behaves like the delta function δ(σ) and consequently (5) reduces to

\[ N(\varepsilon) = \frac{1}{(2\pi)^3} \int_{\sigma = 0} \frac{dS}{|\nabla \sigma|}, \tag{14} \]

where the integration is taken over the surface σ = 0, i.e.

\[ \varepsilon - \varepsilon_{\mathbf{k}} - V_{\mathbf{kk}} - \sum_{\mathbf{K} \neq 0} \frac{|V_{\mathbf{k}, \mathbf{k}+\mathbf{K}}|^2}{\varepsilon - \varepsilon_{\mathbf{k}+\mathbf{K}}} = 0. \tag{15} \]

Equation (14) is the well-known expression for the density of states of
energies given by (15). Equation (15) is the Wigner-Brillouin perturbation
series which, unlike the Rayleigh-Schrödinger series, correctly describes the
behavior of ε to second powers of the matrix elements of V, even in the
neighborhood of the energy gaps.
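
The difference between the two series near a gap can be seen in a minimal model. The sketch below is an illustration under simplifying assumptions: only one reciprocal lattice vector K is kept in the sum of Eq. (15) and V_kk is set to zero, so that (15) becomes a quadratic equation in ε. The function names are hypothetical.

```python
import math

def brillouin_wigner_roots(e_k, e_kK, v):
    """Solve eps - e_k - |v|^2 / (eps - e_kK) = 0, i.e. Eq. (15) with one K
    and V_kk = 0. The equation is (eps - e_k)(eps - e_kK) = |v|^2."""
    mean = 0.5 * (e_k + e_kK)
    half = 0.5 * (e_k - e_kK)
    root = math.sqrt(half * half + abs(v) ** 2)
    return mean - root, mean + root

def rayleigh_schrodinger(e_k, e_kK, v):
    """Second-order Rayleigh-Schrodinger energy of the state k.
    The denominator vanishes when e_k = e_kK (zone boundary)."""
    return e_k + abs(v) ** 2 / (e_k - e_kK)

# At the zone boundary (e_k = e_kK) the roots are split by the gap 2|v|.
lower, upper = brillouin_wigner_roots(1.0, 1.0, 0.1)
gap = upper - lower
```

Far from the zone boundary both series give nearly the same energy, while at the boundary the Rayleigh-Schrödinger expression diverges and the self-consistent form of Eq. (15) gives two levels separated by 2|V|.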

In the second example, H_0 is taken to be the Hamiltonian of electrons in a
periodic field, and V to be an aperiodic addition. The corresponding physical
situation is that of electrons in a disordered alloy. The orthonormal functions
used to transform the Green function are the Bloch wave functions of the
periodic field, of which the eigenvalues are ε_k.

Let U(r - R_i) denote the difference in the potential energy of an electron
when a solvent atom at R_i is replaced by a solute atom at the same lattice
site. Then

\[ V(\mathbf{r}) = \sum_i U(\mathbf{r} - \mathbf{R}_i), \tag{16} \]

where the summation is over n values of R_i corresponding to the n solute
atoms in the lattice of unit volume which contains N lattice sites. Hence

\[ V_{\mathbf{kl}} = \sum_i \int \psi_{\mathbf{l}}^*(\mathbf{r})\, U(\mathbf{r} - \mathbf{R}_i)\, \psi_{\mathbf{k}}(\mathbf{r})\, d^3r, \tag{17} \]

and by making use of the property possessed by all Bloch functions, viz.
ψ_k(r + R_i) = e^{ik·R_i} ψ_k(r), where R_i is a lattice vector, it follows that

\[ V_{\mathbf{kl}} = \phi_{\mathbf{kl}} \sum_i \exp[i(\mathbf{k} - \mathbf{l}) \cdot \mathbf{R}_i], \tag{18} \]

where φ_kl is a matrix element independent of R_i which is equal to the integral
in Eq. (17) when R_i is put equal to zero. Hence

\[ |V_{\mathbf{kl}}|^2 = |\phi_{\mathbf{kl}}|^2 \sum_{i,j} \exp[i(\mathbf{k} - \mathbf{l}) \cdot (\mathbf{R}_i - \mathbf{R}_j)]. \tag{19} \]

The sum may be evaluated by summing first over all r_j = R_i - R_j for given
R_i. It may be assumed that for different R_i these sums will be equal if the
usual procedure is adopted of applying periodic boundary conditions. Hence,
if p(r_j) denotes the probability of finding a solute atom at R_j, given that one
exists at R_i, it follows that

\[ |V_{\mathbf{kl}}|^2 = |\phi_{\mathbf{kl}}|^2 \Bigl\{ n + n \sum_j p(\mathbf{r}_j) \exp[i(\mathbf{k} - \mathbf{l}) \cdot \mathbf{r}_j] \Bigr\}, \tag{20} \]

where

\[ \sum_j p(\mathbf{r}_j) = n - 1. \tag{21} \]

If the solute atoms formed a perfect superlattice, $p(\mathbf{r}_j)$ would be unity for each
superlattice vector $\mathbf{r}_j$ and zero otherwise. The sum in Eq. (20) would be over
all lattice vectors except 0, and therefore equal to $-1$ for arbitrary $\mathbf{k} - \mathbf{l}$, and
hence for this case $V_{\mathbf{kl}} = 0$. However, if $\mathbf{k} - \mathbf{l}$ were a reciprocal superlattice
vector each exponential term would be unity and the sum equal to $n - 1$. Thus
in this case $|V_{\mathbf{kl}}|^2 = n^2 |\phi_{\mathbf{kl}}|^2$. For given $\mathbf{k}$ the quantity $|V_{\mathbf{kl}}|^2$ can be regarded
as a function of position $\mathbf{l}$ in reciprocal lattice space. When a superlattice exists
this function is zero everywhere except at a number of isolated points. When
the solute atoms are randomly distributed over the solvent lattice $|V_{\mathbf{kl}}|^2$ has
quite a different form. The sum in Eq. (20) will be either zero (complete
randomization) or of order unity when local order occurs. Thus the function
of $\mathbf{l}$, $|V_{\mathbf{kl}}|^2$, for each $\mathbf{k}$ will be finite over the whole reciprocal space, but may
have maxima at points separated from $\mathbf{k}$ by small reciprocal superlattice
distances.
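
The contrast between the ordered and the disordered case can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the double sum of Eq. (19) on a one-dimensional ring, where the ring size `N`, the superlattice `period`, and the random seed are all hypothetical choices made only for illustration.

```python
import numpy as np

def structure_sum(positions, q):
    """Return sum_{i,j} exp[i q (R_i - R_j)], i.e. |sum_i exp(i q R_i)|^2."""
    amplitude = np.exp(1j * q * np.asarray(positions)).sum()
    return float(np.abs(amplitude) ** 2)

N = 60                                   # lattice sites on the ring (assumed)
period = 5                               # superlattice period (assumed)
ordered = np.arange(0, N, period)        # perfect superlattice of solute atoms
n = len(ordered)                         # number of solute atoms

rng = np.random.default_rng(0)
disordered = rng.choice(N, size=n, replace=False)   # random solute sites

# Allowed values of k - l on a ring of N sites with unit spacing
qs = 2 * np.pi * np.arange(N) / N

s_ordered = np.array([structure_sum(ordered, q) for q in qs])
s_disordered = np.array([structure_sum(disordered, q) for q in qs])

# Superlattice: n^2 at the reciprocal superlattice vectors, zero elsewhere
peaks = np.flatnonzero(s_ordered > 1e-6)
print("ordered peaks at q-index:", peaks, "height:", s_ordered[peaks][0])

# Disorder: the weight is spread over all q and averages n per allowed q
print("disordered mean over q:", s_disordered.mean())
```

In the ordered case the sum vanishes except at isolated points, where it equals $n^2$; in the random case it is finite for every allowed wave-vector difference, which is the feature the text goes on to exploit.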

It will now be shown how this difference is reflected in the band structure.
When $|V_{\mathbf{kl}}|^2$ is a continuous function of position the sum (13) may be transformed
into an integral. For unit volume of the alloy the number of states
in $d^3l$ is $(2\pi)^{-3}\, d^3l$, and thus if $dS$ denotes an element of surface on the
equi-energy surface $\varepsilon_{\mathbf{l}}$, Eq. (13) may be written



( 22 ) 


Let 



(23) 


thus (22) gives 


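$$\Sigma(\mathbf{k}, \varepsilon) = V_{\mathbf{kk}} + \mathcal{P}\!\int \frac{F(\mathbf{k}, \varepsilon_{\mathbf{l}})}{\varepsilon - \varepsilon_{\mathbf{l}}}\, d\varepsilon_{\mathbf{l}}, \tag{24}$$

and

$$\Gamma(\mathbf{k}, \varepsilon) = \pi F(\mathbf{k}, \varepsilon). \tag{25}$$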


These two equations exhibit the well-known relationship between the real
and imaginary parts of the Green function. When a superlattice exists $F(\mathbf{k}, \varepsilon_{\mathbf{l}})$
will, in general, be zero and hence also $\Gamma = 0$. In this case there is no line
broadening and $\Sigma$ is best determined directly by (13). In the disordered alloy
$F(\mathbf{k}, \varepsilon_{\mathbf{l}})$ is finite and hence both $\Gamma$ and $\Sigma$ differ from zero. $\Sigma$ may be regarded
as determining the displacement of the energy levels and $\Gamma$ their broadening.

To make further precise progress the detailed form of the functions $|V_{\mathbf{kl}}|^2$
and $\varepsilon_{\mathbf{l}}$ must be known, but certain general properties can be inferred without
this knowledge.

Consider an isolated energy band for which $N_0(\varepsilon) = 0$ if $\varepsilon < \varepsilon_0$ or $\varepsilon > \varepsilon_1$. Since
the integration in (23) is taken over the equi-energy surfaces of the pure metal
it is clear that when $\varepsilon < \varepsilon_0$ or $\varepsilon > \varepsilon_1$, $F(\mathbf{k}, \varepsilon) = 0$. Consequently for the
regions just beyond the band limits of the pure metal, the density of states in
the alloy is given by



(26) 


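If $\Sigma(\mathbf{k}, \varepsilon)$ depends on $\mathbf{k}$ only through $\varepsilon_{\mathbf{k}}$, as is the case for the scattering of
free electrons by screened charges, then for $\varepsilon < \varepsilon_0$ or $\varepsilon > \varepsilon_1$,

$$N(\varepsilon) = N_0(\varepsilon_{\mathbf{k}}), \tag{27}$$

where $\varepsilon_{\mathbf{k}}$ is the root of the equation

$$\varepsilon - \varepsilon_{\mathbf{k}} - \Sigma(\varepsilon_{\mathbf{k}}, \varepsilon) = 0. \tag{28}$$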


From this it follows that the lowest and greatest energies in the alloy band
are given by the roots of

$$\varepsilon - \varepsilon_0 - \Sigma(\varepsilon_0, \varepsilon) = 0, \tag{29}$$

and

$$\varepsilon - \varepsilon_1 - \Sigma(\varepsilon_1, \varepsilon) = 0. \tag{30}$$

If the potentials $U(\mathbf{r} - \mathbf{R}_i)$ are so chosen that the diagonal elements of $V$ are
zero, Eq. (24) may be written



(31)


where $\langle |V_{\mathbf{kl}}|^2 \rangle_{\varepsilon_{\mathbf{l}}}$ is an average taken over the surface $\varepsilon_{\mathbf{l}}$. An estimate of $\Sigma$ for
values of $\varepsilon < \varepsilon_0$ is therefore given by

$$\Sigma = -\langle |V_{\mathbf{kl}}|^2 \rangle\, \mathcal{N} / \bar{\varepsilon}, \tag{32}$$

where $\mathcal{N}$ is the number of states in the band and $\bar{\varepsilon}$ an energy of order of the
bandwidth. According to (29) it will be seen that the bottom of the band is
depressed by an amount given by (32). From (23) it may be anticipated that
the value of $F$ and therefore $\Gamma$ will be largest at the center of the band and
from (24) that $\Sigma$ is likely to be small in this neighborhood.

The following conclusions may be drawn from the foregoing analysis:
(a) An energy band is broadened in a dilute disordered alloy by a finite amount,
i.e., the line broadening is not constant. (b) The maximum line width and the
minimum displacement occur near the center of the band. These results have
also been obtained by Sergeeva (1965). (c) Depending on the position of the
Fermi limit in relation to the center of the band (or peak) the density of states
may be increased or decreased by scattering. Clearly if the Fermi limit is at
the center of the band, line broadening will reduce $N(\varepsilon_F)$, whereas if the Fermi
limit is near the band edge, line broadening will increase $N(\varepsilon_F)$. (d) Order
(either short-range or long-range) affects both the energy displacement $\Sigma$
and the line width $\Gamma$, and therefore the effect on the density of states may not
be strictly parallel with the residual resistivity. (e) The present theory, which
depends upon a series development of the Green function terminated at second
powers of the perturbation matrix elements, does not lead to the appearance
of singularities in $N(\varepsilon)$ either within or below the pure metal band.

I. Introduction

When Heisenberg took the mystery out of the Weiss molecular field in 
1927 by showing that exchange effects can couple atomic magnets, it was 
supposed for a while that direct interatomic exchange would give the desired 
interaction. As time went on, it became clear that this is not the case, and 
physicists are still debating the details of the exchange mechanism in the iron 
group. Here the situation is much less clear than in the rare earths where the 
coupling between the f electrons is generally conceded to be of the
Kasuya-Yosida-Ruderman-Kittel indirect type via conduction electrons. Thirty years
ago, Slater (1936, 1937) suggested that in nickel the requisite interaction could
be found in the intra-atomic exchange integral connecting d electrons. Such
an explanation of ferromagnetism is commonly called one based on “Hund’s
rule coupling.” This is the origin of ferromagnetism which I personally consider
the most likely in the iron group. So it seems appropriate in a volume
honoring Slater to give some of my reasons. 

In recent years, a number of papers (Kanamori, 1963; Gutzwiller, 1963, 
1964, 1965; Hubbard, 1963, 1964, 1965) have presented calculations which 
suggest that in the tight binding approximation ferromagnetism can be obtained
even without invoking Hund’s rule. It should, however, be mentioned
that all of these authors have some misgivings whether this is really so, as a
rather strange band shape or density of states must be assumed. Also, Lieb
and Mattis (1962) have shown unequivocally that with a particular model of a
somewhat special character, wherein a three-dimensional potential factors
into three one-dimensional ones, ferromagnetism cannot be obtained without 
invoking Hund’s rule. Mattis (1963, 1964) has amplified this argument further 
in two subsequent articles. The present paper may seem somewhat lacking 
in rigor, or oversimplified, in view of the very detailed work in the various 
papers we have cited, and especially in view of the very scholarly and complete 
review and discussion of the whole situation by Herring (1966) in a forthcoming 
volume of the Rado-Suhl series. However, the various articles usually involve 
considerable mathematical analysis. So it is perhaps not entirely amiss if in 
the present paper we aim to present some elementary considerations which 
make it highly probable that the Hund’s rule effect is usually a necessary 
ingredient to get ferromagnetism out of the tight binding approximation. 

II. The U, T Model 

Until the final section VI, we shall be concerned entirely with the tight 
binding approximation for a system in which there is only one orbital state 
per atom. We suppose that there is a one-electron transfer integral T associated
with the passage of an electron from one atom to another neighboring
one, and that there is a promotion or Coulomb energy U if two electrons are 
on the same atom. We can take T to be negative if the wave functions are 
appropriately and similarly phased on all atoms. This is the sign behavior 
basic to all molecular theory based on the linear combination of atomic 
orbitals, and follows from the general lemma that the lowest wave function 
is nodeless. We neglect such refinements as corrections for nonorthogonality, 
and since our model involves only two parameters, we call it the U,T model. 
The Hamiltonian function, more or less self-explanatory, is thus

$$\mathcal{H} = T \sum_{\sigma} \sum_{\text{neighbors}} a^{\dagger}_{k\sigma} a_{l\sigma} + \tfrac{1}{2} U \sum_{k,\sigma} n_{k\sigma} n_{k,-\sigma}, \tag{1}$$

where $a^{\dagger}_{k\sigma}$, $a_{k\sigma}$ are creation and destruction operators for an electron being
on atom $k$, and where the index $\sigma$ ($= \pm\tfrac{1}{2}$) designates the sign of the spin in
some given direction. The number $n_{k\sigma} = a^{\dagger}_{k\sigma} a_{k\sigma}$ of electrons of given spin
direction on a given atom can only be 0 or 1. The total number of electrons
is $n = \sum_{k,\sigma} n_{k\sigma}$. The factor $\tfrac{1}{2}$ appears in the second member of Eq. (1) because
each polar state is counted twice in the summation.

The superficial argument which appears to extract ferromagnetism from
(1) if U is sufficiently large runs as follows: Assume that the first $\tfrac{1}{2}n$ orbital
states are doubly occupied, with compensating spins. Ferromagnetism is
supposed to ensue if the system is unstable with respect to spin reversal, i.e.,
if the decrease in polar energy more than offsets the increase in orbital energy


The Slater Intra-Atomic Exchange Model for Ferromagnetism 477 


associated with the T terms when we let there be more up than down spins. 
If $\rho(E)$ be the density of orbital states (relative to energy) at the assumed Fermi
surface with complete spin compensation, the increase in the interatomic
orbital energy when a number $\Delta n$ of electrons are promoted to singly occupied
states is

$$(\Delta n)^2 / \rho(E),$$

while with the molecular field approximation the change in polar energy is

$$[U(\tfrac{1}{2}n - \Delta n)(\tfrac{1}{2}n + \Delta n) - U(\tfrac{1}{2}n)^2]/N = -U(\Delta n)^2/N,$$

where N is the total number of sites among which the n electrons are distributed.
The total energy is lowered or, in other words, the system is unstable if

$$U\rho(E)/N > 1. \tag{2}$$
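
The bookkeeping behind criterion (2) can be written out as a short sketch. The function names and the numerical values below are hypothetical and serve only to show that the sign of the total energy change flips exactly where $U\rho(E)/N$ crosses unity.

```python
def energy_change(delta_n, U, rho, N):
    """Molecular-field energy change when delta_n electrons are promoted."""
    orbital_cost = delta_n ** 2 / rho      # increase in transfer (band) energy
    polar_gain = -U * delta_n ** 2 / N     # decrease in polar energy
    return orbital_cost + polar_gain

def is_unstable(U, rho, N):
    """Criterion (2): instability toward spin reversal if U * rho / N > 1."""
    return U * rho / N > 1.0

# Hypothetical numbers, chosen only to straddle the threshold
N = 100
rho = 40.0
for U in (2.0, 3.0):
    print(U, is_unstable(U, rho, N), energy_change(1.0, U, rho, N))
```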

It is generally recognized that the condition (2) for ferromagnetism is too
lenient. Kanamori (1963), using essentially a Brueckner-Goldstone type of
approximation, and Hubbard (1963), using Green’s functions, find that the
condition (2) for ferromagnetism should instead be replaced by one of the
form

$$U_{\text{eff}}\, \rho(E)/N > 1, \tag{3}$$

where $U_{\text{eff}}$ is much smaller than U, because of the effect of screening. In
particular, $U_{\text{eff}}$ is of the order T rather than U in the limit $U \to \infty$. So ferromagnetic
instability is no longer automatically achieved simply by allowing
U to be large. As $U_{\text{eff}}$ and $N/\rho$ both are then of the order T, order-of-magnitude
arguments can no longer be used to discover whether (3) is satisfied. It
does appear from (3) that there is ferromagnetism if the density $\rho(E)$ at the
top of the Fermi surface is sufficiently large compared to the average bandwidth.
Kanamori and Hubbard both show that certain admittedly ad hoc and
artificial band shapes will cause (3) to be fulfilled. The flaw in this type of
argument is that the band shape is not something which can be adjusted in
an arbitrary fashion, but instead it is a consequence of the Hamiltonian (1).
It is the author’s belief that were it possible to make an accurate calculation, 
ferromagnetism would never, or practically never,* ensue from the simple 
Hamiltonian (1). In simple physical terms, the reason for my conviction is that 
the lower the total spin, the more polar states (i.e., states with two electrons 
on the same atom) can be intermixed into the wave function, and the energy 
consequently lowered, as compared with a situation where doubling up on 
the atom is impossible. This is simply a manifestation of the truism that when 

* It should, however, be noted that for a face centered cubic lattice Gutzwiller (1963)
finds that the Hamiltonian (1) with $U = 0$ gives a logarithmic singularity in $\rho$ if the band is
nearly full, and then (3) is automatically satisfied. This is a rather special situation, but it
makes one hesitate to say categorically that ferromagnetism can never occur for the U,T
model.

low-lying nonpolar levels are allowed to interact with higher levels, they are
depressed further. The fallacy involved in the naive use of (2) arises from the
fact that without corrections for polarization or interconfiguration interaction,
completely uncorrelated or itinerant wave functions, in general, mix together
polar and nonpolar terms and so possess increased energy just because the
energy of two electrons is raised when they climb on the same atom. Then
the completely ferromagnetic state, where no such climbing is possible with
our U,T model, would have the lowest energy of all if $|U/T|$ is large. Inclusion
of the corrections, however, completely changes the situation. 

III. Analogy to the Case of Only Two Atoms 

The importance of interconfiguration interaction and the inadequacy of
the Hartree or molecular field approximation underlying (2) is simply illustrated
by the familiar case of two electrons exposed to equal attracting centers
A, B, i.e., the hydrogen molecule. We start with one-electron wave functions
which are linear combinations,

$$\psi_g = (\psi_A + \psi_B)/\sqrt{2}, \qquad \psi_u = (\psi_A - \psi_B)/\sqrt{2},$$

of atomic orbitals. We shall neglect nonorthogonality and direct (i.e., interatomic)
exchange. If one uses a molecular field or orbital procedure analogous
to that used in obtaining (2), one finds that the two-electron wave functions
and their corresponding energies are

$$\begin{aligned}
\Psi({}^3\Sigma_u) &= [\psi_u(1)\psi_g(2) - \psi_u(2)\psi_g(1)]/\sqrt{2}, & E &= 0, \\
\Psi({}^1\Sigma_u) &= [\psi_u(1)\psi_g(2) + \psi_u(2)\psi_g(1)]/\sqrt{2}, & E &= U, \\
\Psi({}^1\Sigma_g) &= \psi_g(1)\psi_g(2), & E &= \tfrac{1}{2}U + 2T, \\
\Psi({}^1\Sigma_g) &= \psi_u(1)\psi_u(2), & E &= \tfrac{1}{2}U - 2T.
\end{aligned} \tag{4}$$


From cursory examination of these equations it would appear that the triplet
would be lowest if U is large. This situation would be the two-center analog
of ferromagnetism. The flaw in this calculation is that it has neglected matrix
elements connecting the two ${}^1\Sigma_g$ states, which makes them have energies
given by

$$E({}^1\Sigma_g) = \tfrac{1}{2}U \pm (\tfrac{1}{4}U^2 + 4T^2)^{1/2}, \tag{5}$$

and the lowest member of the pair is deeper than the triplet regardless of the
values of T and U.
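
The two-center result can be checked with a few lines of linear algebra. This is a sketch under a stated assumption: the off-diagonal element $\tfrac{1}{2}U$ coupling the two ${}^1\Sigma_g$ configurations is not quoted in the text and is inferred here so that the eigenvalues reproduce Eq. (5). The function name and the sample values of U and T are hypothetical.

```python
import numpy as np

def two_center_levels(U, T):
    """Energies of the two-electron, two-center problem in the U,T model."""
    triplet = 0.0                      # no double occupancy is possible
    singlet_u = U                      # purely polar state
    # 2x2 block for the configurations psi_g psi_g and psi_u psi_u.
    # The coupling U/2 is an assumption chosen to reproduce Eq. (5).
    block = np.array([[0.5 * U + 2 * T, 0.5 * U],
                      [0.5 * U, 0.5 * U - 2 * T]])
    singlet_g = np.linalg.eigvalsh(block)   # ascending order
    return triplet, singlet_u, singlet_g

U, T = 8.0, -1.0                       # hypothetical values with |U/T| large
triplet, singlet_u, singlet_g = two_center_levels(U, T)
closed_form = 0.5 * U - np.sqrt(0.25 * U ** 2 + 4 * T ** 2)
print("lowest singlet:", singlet_g[0], "closed form:", closed_form)
print("triplet:", triplet)
```

For any nonzero T the lower root is negative, so it lies below the triplet at zero energy however large U is made.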

In my opinion, as long as one sticks to the Hamiltonian (1), essentially this 
state of affairs persists, i.e., interaction makes the state of minimum spin 
lowest, even when there is a large number of sites, which can in general 
exceed the number of electrons (or alternatively of holes) in the band. Of
course it is dangerous to make such a broad, and rather intuitive, generalization
from a rather trivial example; Herring even says that the evidence is
only “fragmentary” that ferromagnetism can never be extracted from (1). 
It should, however, be mentioned that the mere fact that alkali or other metals 
characterized by nondegenerate bands are never found experimentally to be 
ferromagnetic is evidence of a sort. From the theoretical standpoint, the 
following line of consideration seems to me to give some assurance that, at 
least ordinarily, ferromagnetism does not flow out of the Hamiltonian (1).* 

IV. The Case n = N of One Electron per Atom 

When there are a large number of atoms, the simplest case to consider is 
where there is just one electron per atom, so that n = N. We start now with 
the completely non-polar configuration, and regard the polar configurations 
where two electrons double up on the same atom as perturbations. The nonpolar
configuration has zero transfer energy, as there is no possibility of an
electron jumping from one atom to another without introducing polarity.
Furthermore, this zero energy configuration is completely independent of
spin, as there is no interaction of any sort. Now introduce the interaction with
excited states as a perturbation, and consider only terms of the order $T^2/U$.
An electron on one atom can slide over and double up with another electron
on an adjacent atom, at the expense of introducing a polar energy U, only
if their spins are antiparallel or, more exactly, only if, regarded as a two-electron
system, the pair is in a singlet rather than a triplet state. Since the
triplet and singlet states have respectively $\mathbf{s}_k\cdot\mathbf{s}_l = \tfrac{1}{4}, -\tfrac{3}{4}$, this restriction can
be expressed by saying that the operative nondiagonal transfer term taking a
given electron on atom A to a neighbor atom B already occupied by one electron
is

$$\sqrt{2}\, T\, (\tfrac{1}{4} - \mathbf{s}_A\cdot\mathbf{s}_B).$$

The factor $\sqrt{2}$ comes from using wave functions of the proper symmetry as
regards permutation between A and B. The second-order effect of a perturbation
by upper states $j$ which lifts the degeneracy of a family of originally
coincident energy levels is to introduce an effective Hamiltonian†

$$\langle i | \mathcal{H}_{\text{eff}} | i' \rangle = -\sum_j \left[ \langle i | \mathcal{H} | j \rangle \langle j | \mathcal{H} | i' \rangle / h\nu_{ji} \right]. \tag{6}$$

* The procedure which is to be presented in Sections IV and VI, whereby perturbation
theory is used to take account of the repercussions of the states of higher polarity and to
generate an effective Hamiltonian for the states of minimum polarity, was briefly explained
by the writer in two earlier review articles (Van Vleck, 1953, 1957). In particular, Eq. (8)
of the 1953 article is the same as the Eq. (8) of the present paper. This line of attack is
seldom noticed, perhaps because it was previously presented only rather incidentally in the
discussion of other magnetic problems, and so is now elaborated in more detail.

† For proof of the relation (6) see, for instance, Kemble (1937).

As $h\nu_{ji} = U$, we thus see that the effective Hamiltonian introduced by the
creeping in of polarity is, to lowest order in $T^2/U$,

$$\mathcal{H}_{\text{eff}} = -2(2T^2)U^{-1} \sum_{\text{neighbors}} (\tfrac{1}{4} - \mathbf{s}_k\cdot\mathbf{s}_l)^2.$$

The extra factor 2 appears because an electron can jump from A onto B or
from B onto A. Since $(\mathbf{s}_k\cdot\mathbf{s}_l + \tfrac{1}{4})^2 = \tfrac{1}{4}$, this relation can be written

$$\mathcal{H}_{\text{eff}} = (4T^2/U) \sum_{\text{neighbors}} (\mathbf{s}_k\cdot\mathbf{s}_l - \tfrac{1}{4}). \tag{7}$$

The important thing is that the coefficient $4T^2/U$ of $\mathbf{s}_k\cdot\mathbf{s}_l$ is positive, and so
(7) represents an antiferromagnetic rather than a ferromagnetic situation.
Consequently the lowest state will be that of zero or minimum spin. This can
also be seen qualitatively from the fact that there are more different orientation
possibilities, and so more depression of the ground state by interaction
with the polar upper states, the lower the resultant spin. The state of maximum 
spin, in particular, cannot interact with any polar states at all. 
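
The spin-operator identity used in passing to Eq. (7), and the resulting singlet-triplet splitting, can be verified numerically for a single neighbor pair. This is an illustrative sketch: the matrix representation and the sample values of T and U are choices made here, and the comparison uses the exact two-center root of Eq. (5) as a reference.

```python
import numpy as np

# Spin-1/2 operators (in units of hbar)
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

# s_k . s_l on the four-dimensional space of two spins
s_dot_s = sum(np.kron(s, s) for s in (sx, sy, sz))
identity = np.eye(4)

def h_eff_pair(T, U):
    """Effective Hamiltonian (7) restricted to one neighbor pair."""
    return (4 * T ** 2 / U) * (s_dot_s - identity / 4)

# Identity quoted in the text: (s_k . s_l + 1/4)^2 = 1/4
shifted = s_dot_s + identity / 4
print(np.allclose(shifted @ shifted, identity / 4))

# Hypothetical values with U much larger than |T|
T, U = -1.0, 50.0
levels = np.linalg.eigvalsh(h_eff_pair(T, U))       # singlet, then triplet
exact_singlet = 0.5 * U - np.sqrt(0.25 * U ** 2 + 4 * T ** 2)
print("effective:", levels[0], "exact two-center:", exact_singlet)
```

The singlet is depressed by $4T^2/U$ while the threefold triplet level stays at zero, in line with the antiferromagnetic sign of the coupling.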

V. The Case n ≠ N of Unequal Numbers of Electrons and Sites

In the more general case* where the number of electrons is less than or 
greater than the number of sites, we have not been able to construct with any 
degree of rigor a proof that the lowest state for our U,T model is that of 
minimum spin. If one tries to construct a proof along the lines of that for the 
special case $n = N$, one notes, first of all, that the manifold of non-polar
states (or of minimum polarity if $n > N$) has a band structure, since the n
electrons can redistribute themselves among the N various sites without
introducing any polarity if $n < N$, or more than the minimum unavoidable
amount $U(n - N)$ if $n > N$. Furthermore, this band structure will show some
dependence on spin. This fact seems rather surprising at first, as the only 
constant entering in the problem is the one-electron transfer integral T. The 
reason is that by successive one-electron jumps one can pass from one 
configuration to another differing from it only by a permutation, and as a 

* An important difference should be noted in the physics of the situations discussed in
Section V as compared with Section IV. In the ideally non-polar configuration for $n = N$,
no electrical conductivity or specific heat is possible, as no redistribution is possible. On
the other hand, when $n \neq N$ there can be conductivity and specific heat even in the manifold
of minimum polarity, since electrons can redistribute themselves among the sites without
changing the number of unoccupied ones if $n < N$, or of doubly inhabited ones if $n > N$.
When $n = N$, conductivity and specific heat can begin to appear as soon as U is no longer
treated as extremely large compared to T. That appreciable conductivity sets in rather
suddenly when $T/U$ exceeds a certain critical value has been stressed by Hubbard (1964).
This corresponds to the physical fact that the distinction between insulators and conductors
is usually a rather sharp one.

result the energy is correlated with the symmetry types of the orbital permutation
group, and hence, because of the exclusion principle, with spin alignment.
As a simple example consider two electrons distributed in a lattice consisting
only of three equivalent sites situated at the vertices of an equilateral triangle
(ABC). If (120), for instance, denotes an unsymmetrized state where electrons
1, 2 are respectively on sites A, B, then by one-electron jumps one can go
through the sequence (120) → (102) → (012) → (210). When one introduces the
proper symmetric or antisymmetric combinations, it turns out that the state
of lowest energy is that of zero spin. For the more general problem of a ring
containing two electrons and n sites, one finds that the singlet is usually
deepest, but that for certain values of n, the deepest singlet and triplet may
coincide. For a quadrilateral (ABCD) with three electrons, one of the states
with $S = \tfrac{1}{2}$ has a lower energy than any with $S = \tfrac{3}{2}$. We omit details of the
secular equations that lead to these results, as the calculations are elementary.
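
The triangle example can be reproduced with a small secular problem. The sketch below is an idealization adopted here, not the author's own calculation: double occupancy is forbidden outright (the limit of very large U), T is taken negative as in Section II, and the function names are hypothetical. Only the three-site case is compared with the statement in the text.

```python
import numpy as np
from itertools import combinations

def sector_hamiltonian(n_sites, sign, T=-1.0):
    """Two electrons on a ring with double occupancy forbidden.

    sign = +1: orbitally symmetric states (spin singlet);
    sign = -1: orbitally antisymmetric states (spin triplet).
    """
    pairs = list(combinations(range(n_sites), 2))
    index = {pair: i for i, pair in enumerate(pairs)}
    H = np.zeros((len(pairs), len(pairs)))
    for (a, b), col in index.items():
        # (site of the moving electron, site of the other one, slot of mover)
        for mover, other, slot in ((a, b, 0), (b, a, 1)):
            for step in (-1, 1):
                new = (mover + step) % n_sites
                if new == other:
                    continue                      # would create a polar state
                moved = (new, other) if slot == 0 else (other, new)
                if moved[0] < moved[1]:
                    H[index[moved], col] += T
                else:                             # the electrons have changed order
                    H[index[moved[::-1]], col] += sign * T
    return H

def lowest_levels(n_sites, T=-1.0):
    """Lowest singlet and triplet energies in the manifold of zero polarity."""
    singlet = np.linalg.eigvalsh(sector_hamiltonian(n_sites, +1, T))[0]
    triplet = np.linalg.eigvalsh(sector_hamiltonian(n_sites, -1, T))[0]
    return singlet, triplet

singlet, triplet = lowest_levels(3, T=-1.0)
print("triangle: singlet", singlet, "triplet", triplet)
```

With these assumptions the splitting is of order T itself, which illustrates the permutation effect: the spin dependence arises here without any admixture of polar states.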

When we come to a real lattice with three dimensions and a large number of 
atoms, the dependence on spin of the states of minimum polarity appears to 
be a complicated topological problem involving the various types of links 
which can be constructed, analogous to Feynman diagrams, and we are unable 
to make any definite statements. However, we believe that the dependence on 
spin is usually a rather minor part of the band structure, since only a small 
fraction of the various successive transfers lead to configurations differing 
from the original one only by a permutation.* Probably a state of minimum 
spin is lowest, as low spin states are the most numerous and give the most 
possibility of being pried apart by the permutation effect. In any case we can 
invoke the perturbing influence of the states of higher than minimum polarity
to give antiferromagnetism. The large number of terms of order $T^2/U$ presumably
outweighs the small fraction of those of order T that are spin dependent.
One can still include the second-order effect of the excited states by introducing 
the effective Hamiltonian (7) with, of course, the understanding that the sum 
goes only over the pairs of adjacent sites that are both occupied in some 

* For a linear lattice, there can be no permutations of the type we are discussing in
the state of minimum polarity, since one electron cannot pass another without raising the
polarity while passing. Hence, there can be no dependence on spin until one considers the
perturbing effect of upper states of the lattice, and it is this effect which makes the system
behave in accordance with the Lieb-Mattis theorem, as of course it must, since the problem
is one-dimensional. This situation is reflected in the fact that Slater et al. (1953) find that
for a U,T system with only two electrons free to redistribute themselves along a linear
chain, the singlet separates from and falls below the triplet only when one includes
terms of the order $T^2/U$. On the other hand, for the two-dimensional 3 × 3 problem which
they treat and which we discuss later, the excess of energy of the triplet over the singlet is of
the order T, a manifestation of the fact that there is the permutation effect, and interaction
with upper configurations need not be invoked to pry apart the states of minimum
polarity differing in spin.

configuration of minimum polarity (zero polarity if $n \le N$). In addition there
is another effect not found in the special case $n = N$. Namely, if sites A and B
are originally singly occupied and C is empty, and if A and C are both neighbors
of B, then an electron can pass from A to C by going through an intermediate
state in which two electrons are on B. This will lead to effective
transfer terms of the order $-T^2/U$ whereby an electron can make double
jumps. These terms are negative, and therefore presumably bonding. Also,
they are present only if the spins of A and B are oppositely directed. So presumably
these new transfer terms tend to strengthen the antiferromagnetic
tendency. It seems, however, somewhat questionable to consider them at all,
since we are neglecting all interactions except those between nearest neighbors.

We have discussed primarily the case that U is large compared to T. In 
the other limit that U is very small compared to T, the nonmagnetic state is 
clearly lowest, as the total energy is minimized if the deepest states in the band 
are doubly occupied. If the transfer effect is finally widened in scope to give
free electrons (i.e., $U/T = 0$, and the T terms finally flow over into the kinetic
energy terms), one gets the Pauli feeble paramagnetism, as long as true
exchange is neglected. The antiferromagnetism which we obtain with the
U,T model for large U can therefore be regarded as the counterpart for tight
binding of the Pauli feeble paramagnetism for loose or no binding.

If ferromagnetism does not ensue from the U,T model if $U/T \gg 1$ or $U/T \ll 1$,
one suspects that this is also the case for intermediate ranges of $T/U$,
but of course this is no rigorous proof. In this connection, there are two
further papers that can be mentioned which cover the intermediate range with 
certain approximations. Slater et al. (1953) have shown that for a two-dimensional
model of a 3 × 3 lattice containing 9 sites in all, the singlet is
lower than the triplet. This is some indication that in slightly filled bands (or
slightly empty ones, using holes instead of electrons), there is no ferromagnetism
with a U,T model, inasmuch as the simple pair effects presumably
dominate the cooperative behavior, if only a relatively small number of sites
are occupied. Unfortunately, comparable calculations for a three-dimensional
lattice are wanting. Kikuchi (1953) has treated an arbitrary lattice of similar
sites by a procedure which is closely allied to the constant coupling approximation
of Kasteleijn and van Kranendonk (1956) and Nakamura (1953).
The difference is that he allows migration within the pair which is taken as the 
fundamental cluster. He finds that the U,T model cannot give ferromagnetism 
with his approximations. 

VI. Effect of Degeneracy and Hund’s Rule Coupling 

How do we get ferromagnetism if it is not provided by the U,T model, and
if direct interatomic exchange is negligible? It can creep in when we remove
the restriction that there is only one orbital state per atom, and allow the
polar states of high multiplicity to be considerably lower than those of lower
multiplicity. The non-polar states with parallel spins then can be more depressed
by perturbation from polar states than those with antiparallel ones
simply because the denominators in the formula for the perturbed energy are
smaller, even though the “line-strength” associated with the perturbation is
greater if the alignment is antiparallel. This can be seen by looking at a rather
schematic model with, say, five orbital states per atom, as in the d-shell, and
with the orbital magnetism quenched by the crystalline field, so that we need
consider only the magnetic effects of spin. We will assume that there is one
electron per site, so that one has the simplest case $n = N$. We also add the
rather naive assumption that the transfer integral T is independent of which of
the five states the electron lands up on when jumping from one atom to another.
One can easily convince oneself that the perturbing line strength is $\tfrac{3}{2}$ as great
in the singlet as in the triplet case, since the Pauli principle does not allow a
given electron to land in an already occupied state, so that, so to speak, for $\tfrac{1}{5}$
of the transitions the line strength is taken out of the triplet and put in the
singlet. If we further assume that all the triplet states of a doubly occupied
atom have an energy lower than that of the singlets by an amount $2J_0$, the
formula for the effective Hamiltonian analogous to (7) is



and ferromagnetism can ensue if $J_0 > U/5$. We have used the notation $J_0$ for
the intra-atomic exchange integral in order to distinguish it from the interatomic
exchange integral, which is so commonly denoted by $J$.
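
The threshold $J_0 > U/5$ can be recovered from a simple comparison of second-order depressions. The sketch below rests on assumptions made here for illustration: line strengths of $4T^2$ for the triplet pair and $6T^2$ for the singlet pair (four allowed landing orbitals in either case, plus a doubled contribution from the occupied orbital in the singlet only), polar energies $U - J_0$ and $U + J_0$ so that U is the mean, and a common overall prefactor that is omitted. The function names and sample values are hypothetical.

```python
def pair_depressions(T, U, J0):
    """Second-order depressions of a neighbor pair in the five-orbital sketch.

    Assumed line strengths: 4*T**2 (triplet) and 6*T**2 (singlet).
    Assumed polar energies: U - J0 (triplet) and U + J0 (singlet).
    """
    triplet = -4 * T ** 2 / (U - J0)
    singlet = -6 * T ** 2 / (U + J0)
    return triplet, singlet

def favors_parallel_spins(T, U, J0):
    """True when the parallel-spin pair is depressed more than the singlet."""
    triplet, singlet = pair_depressions(T, U, J0)
    return triplet < singlet

U, T = 5.0, -0.5                         # hypothetical values
for J0 in (0.5, 1.0, 1.5):               # U/5 = 1.0 is the expected threshold
    print(J0, pair_depressions(T, U, J0), favors_parallel_spins(T, U, J0))
```

Under these assumptions the two depressions are equal exactly at $J_0 = U/5$, and parallel alignment is favored only above that value, which matches the condition stated in the text.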

According to this view, ferromagnetism can arise only if the intra-atomic 
exchange integral J 0 is not too small compared with the mean polar energy U. 
Estimates of U seem to be steadily diminishing with time. The question of the
size of U has been reviewed and considered very carefully by Herring (1966),
who concludes that U may be only a matter of a volt or so. Among other things
a polar lattice has a favorable Madelung energy, and there can be screening
of the intra-atomic interaction by penetration by s or p conduction electrons.
This screening effect will, incidentally, diminish $J_0$ as well as U. When one
gets to a situation where U, $J_0$, and T are of the same order of magnitude,
one is in a difficult region, for neither an uncorrelated band model nor the
localized one such as we have employed is a good starting point for a perturbation
calculation. Unfortunately, one seems to be in just this state of affairs
in the iron group, and everything that I have said should be regarded as
more schematic than realistic. It is simply my feeling that the Hund’s rule
coupling as originally proposed by Slater is usually an essential ingredient.
It may well be (a possibility suggested by Herring, 1966) that there is a near
cancellation of many terms, and that the Hund’s rule coupling parameter,
i.e., the intra-atomic exchange integral, even when diminished by screening, is
what tips the scales in favor of ferromagnetism. Whether this coupling is also
necessary to get localized magnetic states in an otherwise nonmagnetic
conductor is a somewhat different question, but the presumption is that if it
is responsible for the ferromagnetism of d electrons, then it is also probably
needed for them to produce localized states.

I. Introduction

Slater (1937) was the first to describe the properties of spin waves on the 
basis of energy band theory. His discussion was applicable to ferromagnetic 
insulators and, under some approximations, leads, among others, to the 
following result: 

If the energy $\hbar\omega_q$ of spin waves with wave vector $\mathbf{q}$ is expanded in even
powers of $q$, then the leading term, corresponding to long wave lengths, is
essentially of the form

$$\hbar\omega_q = Dq^2, \tag{1}$$

where

$$D = \tfrac{1}{6} z a^2 (J - 2W^2/I). \tag{2}$$

Here $a$ is the interatomic distance, $z$ the number of nearest neighbors, $J$ the
Heisenberg exchange integral for nearest neighbors, $W$ the interatomic
hopping integral determining the bandwidth and $I$ an energy whose major
contribution is the intraatomic Coulomb interaction. The second term in
Eq. (2) is in the nature of a superexchange interaction energy, and Slater’s
analysis thus predates Anderson’s discussion (1950, 1959) of the importance
of superexchange in magnetic insulators.

Since 1937 there have been several important developments in the theory 
of spin waves, and it seems appropriate on this occasion to summarize some 
of these, with particular reference to metallic ferromagnets.

The first development to be noted is the establishment of the itinerant
electron model of ferromagnetism in metals. Some of the basic ideas of this 
model had, in fact, also been proposed by Slater (1936a,b), but it was left to 
Stoner (1938, 1939) to put these ideas on a more systematic basis. Briefly, 
the model combines the use of Fermi statistics, the concept of single particle 
excitations in energy bands, and the use of a molecular field to represent the 
interactions between the single particles. The theoretical status of this model 
was discussed by Wohlfarth (1953) and more recently in great detail by Her¬ 
ring (1966). Among several relevant results of this work is the condition for 
ferromagnetism of the single particles, the so-called Stoner criterion, 

$I N(E_F) > 1$, (3)

where $N(\varepsilon)$ is the density of single particle states per atom and $E_F$ the
paramagnetic Fermi energy. It has, however, been shown (Wohlfarth and Rhodes, 1962;
Shimizu and Katsuki, 1964; Shimizu, 1964) that a more searching analysis
than hitherto, leading in particular to a more reliable criterion for
ferromagnetism, is obtainable by considering the full course of the curves giving
the total energy $E$ as a function of the relative magnetization $\zeta$ at 0°K. This
type of analysis shows that ferromagnetism may set in even where (3) is not obeyed.
A discussion of $E$, $\zeta$ curves is given in Section II, which includes an example of
the insufficiency of (3).

Another development is the theoretical discussion of the properties of
spin waves in metals. Their very existence in metals was at one time taken to
rule out the itinerant electron model, but such is no longer the case. First
Herring (1952a,b) and later many others showed quite clearly that the two
types of excitations (single particle and spin wave) are fully capable of
peaceful coexistence, and that a dispersion relation between spin wave energy and
wave vector can be established for particular assumptions. For a particular
set of assumptions [(1) a single band of itinerant electrons, (2) the only
interaction is the short range, intraatomic interaction which is reduced from $I$
to $I_{\mathrm{eff}}$ by correlation and/or conduction electron screening, (3) the interaction
entering the expression for the spin wave energy is the same as that
determining the properties of the single particles] the dispersion relation is of the form
(1) but with $D$ containing, apart from the superexchange term in (2), a kinetic
energy term corresponding to the motion of the single particles and zero
only in the insulating limit. The first term in (2) does not occur for this set
of assumptions.

A brief discussion of spin waves in metals and the derivation of $D$ (see, for
example, Izuyama and Kubo, 1964; Englert and Antonoff, 1964) is given in
Section III. The resulting formula is, however, unwieldy as complicated
summations over the single particle k-space are involved. Hence work is now in
progress to calculate $D$ and to assess its properties for some simple forms of
the single particle energy and for limiting values of the relative magnetization
$\zeta$. Some of the results already obtained are discussed in Section IV. It is found
that in some cases $D$ is positive if (3) or a more relevant criterion is obeyed
and that in some of these cases $D$ vanishes together with $\zeta$. Elsewhere, $D$
may be negative under these circumstances so that, in line with the discussion
of Mattis (1965) and others, the ferromagnetic ground state must be unstable
compared with one corresponding, presumably, to antiferromagnetic ordering
or a spiral spin configuration. In one case discussed in Section IV both the
criterion for ferromagnetism of the single particles and the sign of $D$ indicate
a stable ferromagnetic ground state although this seems to be forbidden here
by the Lieb-Mattis theorem (1962). Hence even more searching methods are
needed to test the ground state for stability. (See Penn (1966) and Katsuki and
Wohlfarth (1966) for more recent work.)

A third development since 1937 has been the introduction of various
experimental measurements of $D$. One of the most fruitful of these involves the
small angle neutron scattering technique (Hatherly et al., 1964; Shirane et al.,
1965); this has been applied to a wide range of ferromagnetic metals and alloys
and published and unpublished data are reproduced in Section V (by courtesy
of Lowde, Stringfellow, and their colleagues). The question arises how these
data are to be interpreted in the absence of reliable calculated values of $D$
corresponding to the band structure of these alloys. As discussed
semiquantitatively in Section IV, for want of a more reliable comparison it seems
not unreasonable to correlate $D$ roughly with the Curie temperature $T_c$. This
correlation is shown in Section V to be as good as can be expected.

II. Itinerant Electron Ferromagnetism at 0°K 


On the basis of the premises listed in Section I it is possible (Wohlfarth and
Rhodes, 1962) to obtain the total energy of a ferromagnetic metal as a
function of the relative magnetization $\zeta$ at 0°K in the form

$E(\zeta) = \int_{E_F}^{E_F^+} \varepsilon N(\varepsilon)\, d\varepsilon - \int_{E_F^-}^{E_F} \varepsilon N(\varepsilon)\, d\varepsilon - \tfrac{1}{2} n k_B \theta' \zeta^2$, (4)

where

$\tfrac{1}{2} n \zeta = \int_{E_F}^{E_F^+} N(\varepsilon)\, d\varepsilon = \int_{E_F^-}^{E_F} N(\varepsilon)\, d\varepsilon$. (4a)

Here $n$ is the number of particles per atom, $E_F^{\pm}$ are the Fermi energies for the
± spins, and

$k_B \theta' = \tfrac{1}{2} n I_{\mathrm{eff}}$. (5)

Minimizing $E(\zeta)$ gives

$E_F^+ - E_F^- = \Delta E = 2 k_B \theta' \zeta$, (5a)

this being the so-called exchange or molecular field splitting (Wohlfarth,
1965). Ferromagnetism then occurs if $E(\zeta)$ has a maximum at $\zeta = 0$ and a
minimum or extremum at a finite value of $\zeta$. Since

$\dfrac{\partial^2 E(\zeta)}{\partial \zeta^2} = \dfrac{1}{4} n^2 \left\{ \dfrac{1}{N(E_F^+)} + \dfrac{1}{N(E_F^-)} \right\} - n k_B \theta'$, (6)

this is the case if (3) is satisfied ($I \to I_{\mathrm{eff}}$). However (Shimizu and Katsuki,
1964; Shimizu, 1964) the function $E(\zeta)$ may well have a minimum at $\zeta = 0$,
so that (3) is not satisfied, but reach a further minimum or extremum at a
finite value of $\zeta$ where

$E(\zeta) < E(0)$. (7)

Apart from the question whether (7) really implies that this second
equilibrium state is indeed invariably attained, the possibility thus exists that the
Stoner criterion is not always a reliable guide to the occurrence of
ferromagnetism.

The above authors show that this situation is likely to occur if $E_F$ lies near
a minimum of the $N(\varepsilon)$ curve, and it is thus of interest to consider as an
example the unrealistic case of a linear metal in the tight binding
approximation, as here the single particle energy $\varepsilon(k)$ used in the calculation of $D$ in
Section IV is known and the $N(\varepsilon)$ curve has a positive curvature. Here

$\varepsilon(k) = W \sin^2 \tfrac{1}{2} ak$, $N(\varepsilon) = \{\pi \varepsilon (W - \varepsilon)\}^{-1/2}$, (8)

where $W$ is the bandwidth. If $\beta = \tfrac{1}{2}\pi n$, then

$E(\zeta) = \dfrac{W}{\pi} \sin\beta \left\{ 1 - \cos(\beta\zeta) - \dfrac{j\beta^2}{2\sin\beta}\, \zeta^2 \right\}$, (9)

where $j = I_{\mathrm{eff}}/W$. The Stoner criterion is thus $j > j_s$, where

$j_s = \sin\beta$, (10)

while ferromagnetism actually occurs when $j > j_c$, where

$j_c = j_s \{2 \sin(\tfrac{1}{2}\beta)/\beta\}^2 < j_s$. (11)

For other forms of $\varepsilon(k)$ and $N(\varepsilon)$, such as a band where the curvature of $N(\varepsilon)$
is predominantly negative, the Stoner criterion is, however, relevant.
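
The gap between the two thresholds of the linear metal can be checked numerically. The following is only a minimal sketch: it takes (10) and (11) as printed and assumes that (9) has the form $E(\zeta) = (W/\pi)\sin\beta\,\{1 - \cos\beta\zeta - (j\beta^2/2\sin\beta)\zeta^2\}$, which is the form consistent with (10) and (11). The function names are illustrative and do not come from the original work.

```python
import math


def stoner_threshold(n):
    """j_s = sin(beta) with beta = pi*n/2, Eq. (10) as printed."""
    beta = 0.5 * math.pi * n
    return math.sin(beta)


def actual_threshold(n):
    """j_c = j_s * (2 sin(beta/2) / beta)^2, Eq. (11)."""
    beta = 0.5 * math.pi * n
    return math.sin(beta) * (2.0 * math.sin(0.5 * beta) / beta) ** 2


def energy(zeta, n, j, bandwidth=1.0):
    """E(zeta) relative to the paramagnetic state.

    ASSUMPTION: this is the form of Eq. (9) that reproduces (10) and (11);
    the printed equation is not fully legible.
    """
    beta = 0.5 * math.pi * n
    kinetic = 1.0 - math.cos(beta * zeta)
    exchange = j * beta ** 2 * zeta ** 2 / (2.0 * math.sin(beta))
    return (bandwidth / math.pi) * math.sin(beta) * (kinetic - exchange)


if __name__ == "__main__":
    for n in (0.2, 0.5, 0.9):
        j_s, j_c = stoner_threshold(n), actual_threshold(n)
        j = 0.5 * (j_s + j_c)  # a coupling between the two thresholds
        print(f"n={n}: j_c={j_c:.4f} < j_s={j_s:.4f}, "
              f"E(0.05)={energy(0.05, n, j):+.2e}, E(1)={energy(1.0, n, j):+.2e}")
```

For a coupling between $j_c$ and $j_s$ the energy rises away from $\zeta = 0$ (so (3) is not satisfied) yet lies below $E(0)$ at $\zeta = 1$, which is the situation described by (7).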


III. Summary of the Theory of Spin Waves in Metals 

For the set of assumptions listed in Section I the Hamiltonian may be
expressed in terms of destruction and creation operators $c$, $c^*$ in the form
(Herring, 1966; Izuyama and Kubo, 1964; Englert and Antonoff, 1964; and
others)

$\mathcal{H} = \sum_{\mathbf{k},\sigma} \varepsilon(\mathbf{k})\, c^*_{\mathbf{k}\sigma} c_{\mathbf{k}\sigma} + \tfrac{1}{2} \sum_{\mathbf{q},\mathbf{k},\mathbf{k}',\sigma,\sigma'} V(\mathbf{q})\, c^*_{\mathbf{k}+\mathbf{q},\sigma}\, c^*_{\mathbf{k}'-\mathbf{q},\sigma'}\, c_{\mathbf{k}'\sigma'}\, c_{\mathbf{k}\sigma}$. (12)


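Here $\varepsilon(\mathbf{k})$ is the single particle energy, $\sigma$, $\sigma'$ are spin indices and $V(\mathbf{q})$ is
the Fourier transform of the interaction energy. In later applications this
will be taken to be determined only by the intraatomic interaction $I_{\mathrm{eff}}$,
when $V(\mathbf{q})$ is independent of $\mathbf{q}$. The equation of motion for the operator $S_{\mathbf{q}}(\mathbf{k})$,
which corresponds to the removal of a particle with + spin in a state $\mathbf{k}$ to a
state $\mathbf{k} + \mathbf{q}$ and a reversal of the spin to −, and which is thus given by

$S_{\mathbf{q}}(\mathbf{k}) = c^*_{\mathbf{k}+\mathbf{q},-}\, c_{\mathbf{k},+}$, (13)

is

$i\hbar\, \dot{S}_{\mathbf{q}}(\mathbf{k}) = [S_{\mathbf{q}}(\mathbf{k}), \mathcal{H}]$; (14)

this may be solved by using the random phase approximation. In terms of
unknown coefficients $\alpha_{\mathbf{q}}(\mathbf{k}, \omega)$, which define the spin wave operator $\eta_{\mathbf{q}}(\omega)$
associated with a frequency $\omega$ and are given by

$\eta_{\mathbf{q}}(\omega) = \sum_{\mathbf{k}} \alpha_{\mathbf{q}}(\mathbf{k}, \omega)\, S_{\mathbf{q}}(\mathbf{k})$, (15)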

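the equation of motion is transformed to

$[\hbar\omega_q - \varepsilon(\mathbf{k}+\mathbf{q}) + \varepsilon(\mathbf{k}) - 2V(0)\zeta]\, \alpha_{\mathbf{q}}(\mathbf{k}, \omega) + \dfrac{2}{nN} \sum_{\mathbf{k}'} (f_{\mathbf{k}'+} - f_{\mathbf{k}'+\mathbf{q},-})\, V(\mathbf{q})\, \alpha_{\mathbf{q}}(\mathbf{k}', \omega) = 0$, (16)

so that

$1 + \dfrac{2}{nN}\, V(\mathbf{q}) \sum_{\mathbf{k}} \dfrac{f_{\mathbf{k}+} - f_{\mathbf{k}+\mathbf{q},-}}{\hbar\omega_q - \varepsilon(\mathbf{k}+\mathbf{q}) + \varepsilon(\mathbf{k}) - 2V(0)\zeta} = 0$, (17)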


where the $f_{\pm}$ are Fermi functions and $\zeta$ is the relative magnetization. By
considering the behavior of (17) in the complex $\omega$-plane it may be shown that
(17) has two types of solutions: (1) Excitations of the single particles in the
energy band,

$\hbar\omega_q = \varepsilon(\mathbf{k}+\mathbf{q}) - \varepsilon(\mathbf{k}) + 2V(0)\zeta$; (18)

these commence at $q = 0$ at an energy equal to the exchange splitting

$\Delta E = 2V(0)\zeta = 2k_B\theta'\zeta = nI_{\mathrm{eff}}\zeta$ (19)

above the ground state. (2) Spin waves which are low lying and have an energy
which vanishes at $q = 0$ and which departs from the origin for small $q$
according to (1), with $D$ now given (for a cubic metal) by

[Equation (20) is missing from the source text.] (20)

The first term in (20) is equivalent to the first term in (2) and involves
interactions of the Heisenberg type which will no longer be considered in the
following. The second term is the kinetic energy term which vanishes in the
insulating limit and thus does not occur in (2). The last, superexchange, term
in (20) does occur in (2), in rudimentary form.

In order to calculate the complete spin wave spectrum, (17) must be
considered as it stands. In this way higher order terms in $\hbar\omega_q$ (recently
observed by Shirane et al., 1965; Weber and Tannenwald, 1965) could be calculated.
Eventually the two solutions of (17) will merge when $q = q_{\max}$, say; for $q > q_{\max}$
the spin waves are damped with a short lifetime.


IV. Theoretical Calculations of D 


Equation (20) for $D$, with the first term excluded in the present
approximations, is a complicated expression involving, as it does, intricate summations
in k-space. Thus, though these approximations are rather restrictive, it is
even then not easy to obtain reliable values of $D$ for any specific ferromagnetic
metal or alloy since $\varepsilon(k)$ curves and their derivatives are not generally known
with sufficient accuracy and since interband effects must in any case be very
important. The attack on $D$ has therefore had to be two-pronged: (1) Calculate
$D$ for some simple forms of $\varepsilon(k)$ in order to gain some feeling for the
properties of this function; (2) attempt, on the basis of such calculations, to obtain
some other property of real materials with which $D$ may be expected to have
some degree of correlation.

The first type of energy band used in calculating $D$ corresponded to the
effective mass approximation (Herring, 1952b; Thompson, 1963; Mattis,
1963). It is found that at 0°K, and with $\zeta < 1$,



[Equation (21) is missing from the source text.] (21)

where

$g(x) = \tfrac{2}{5}\, [(1 + x)^{5/3} - (1 - x)^{5/3}] / [(1 + x)^{2/3} - (1 - x)^{2/3}]$,

while if $\zeta = 1$, i.e., $\Delta E > 2^{2/3} E_F$,

[Equation (21)* is missing from the source text.] (21)*

If $\zeta$ is small, expansion of (21) gives

$D = \hbar^2 \zeta / 18 m^* + O(\zeta^3)$, (22)

so that here $D$ vanishes exactly where $\zeta$ vanishes, i.e., where $I_{\mathrm{eff}} N(E_F) = 1$.
It may be shown (Doniach and Wohlfarth, 1965) that the proportionality of
$D$ with $\zeta$ in this limit is a general consequence of the form of (20) and is not
just a consequence of the effective mass approximation. If the Stoner criterion
is not relevant, however (see Section II), the situation is more complicated, as
discussed below. Where this difficulty does not arise it has been found by
Thompson et al. (1964) that the Curie temperature $T_c$ and the value of $\zeta$
at 0°K are proportional in this limit. Hence

$D \sim k_B T_c a^2 f(n) + O(T_c^3)$, (23)

where $a$ is the lattice constant and $n$ the number of electrons per atom. For
the effective mass approximation

$D = (\pi / 6\sqrt{2})(k_B T_c / k_F^2) + O(T_c^3)$, (23)*

where $k_F$ is the Fermi momentum, so that here $f(n) \sim n^{-2/3}$. Since for most
ferromagnetic alloys $T_c$ varies more rapidly with composition than does $n$
it appears that some degree of correlation is to be expected between $D$ and $T_c$,
even where this is not small. This correlation, derived here as a rough guide
from a preliminary consideration of the form of the spin wave spectrum in the
weakly magnetic metallic limit, is also found in what may be regarded as
the opposite limit, that where the Heisenberg model alone is applicable.
Here (Hatherly et al., 1964)

$D \sim k_B T_c a^2 / (S + 1)$, (24)

where $S$ is the localized spin. The degree of correlation between $D$ and $T_c$
actually found is shown in Section V.
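
The small-$\zeta$ limit (22) can be checked numerically from the function $g(x)$. The sketch below rests on two assumptions that are not legible in the printed text: that the prefactor of $g$ is 2/5, and that (21) has the form $D = (\hbar^2/2m^*\zeta)\{1 - g(\zeta)\}$. Both are chosen because together they reproduce the coefficient 1/18 of (22); the function names are illustrative only.

```python
def g(x):
    """g(x) with an ASSUMED prefactor of 2/5 (illegible in the printed text)."""
    numerator = (1.0 + x) ** (5.0 / 3.0) - (1.0 - x) ** (5.0 / 3.0)
    denominator = (1.0 + x) ** (2.0 / 3.0) - (1.0 - x) ** (2.0 / 3.0)
    return 0.4 * numerator / denominator


def d_reduced(zeta):
    """D in units of hbar^2/m*, ASSUMING D = (hbar^2 / 2 m* zeta) * (1 - g(zeta))."""
    return (1.0 - g(zeta)) / (2.0 * zeta)


if __name__ == "__main__":
    for zeta in (0.01, 0.05, 0.2):
        # The ratio D/zeta should approach 1/18 as zeta -> 0, as in Eq. (22).
        print(f"zeta={zeta}: D/zeta = {d_reduced(zeta) / zeta:.6f}, 1/18 = {1.0 / 18.0:.6f}")
```

Under these assumptions $D/\zeta$ tends to $\hbar^2/18m^*$ as $\zeta \to 0$, so that $D$ vanishes linearly with the magnetization as stated in the text.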

Further calculations of $D$ are needed for other types of energy band so as
to see how general are the results for the parabolic band quoted above. As a
first attempt $D$ was calculated for the unrealistic case of a linear metal where
$\varepsilon(k)$ and $N(\varepsilon)$ are given by (8). It is found that at 0°K,

$D / W a^2 =$ [right-hand side of Eq. (25) is missing from the source text], (25)

where $W$ is the bandwidth, $j = I_{\mathrm{eff}}/W$, and $\beta = \tfrac{1}{2}\pi n$; it is assumed that
$j > j_c$, where $j_c$ is given by (11), when $\zeta = 1$. The first term in (25) gives the
kinetic energy contribution which is seen to vanish correctly when $n = 1$,
i.e., in the insulating limit. The second term gives the superexchange
contribution analogous to that in (2); when the zone is full ($n = 1$, $\zeta = 1$)

$D = -W^2 a^2 / 8 I_{\mathrm{eff}}$. (26)

On the other hand, if $n$ is small the superexchange term also becomes small
and $D$ is determined mainly by the kinetic energy term; if $n \to 0$ ($\zeta = 1$),

$D = \tfrac{1}{4} W a^2$. (27)



Fig. 1. Variation of coefficient $D$ with electron/atom ratio $n$ of a linear metal. Numbers
on curves give $j = I_{\mathrm{eff}}/W$; $I_{\mathrm{eff}}$, effective interaction; $W$, bandwidth; $a$, interatomic distance.

Fig. 1 shows the relation between $D/Wa^2$ and $n$ for a range of values of
$j$; for each of these there is an upper limit to $n$ determined by the condition
$j > j_c$ (this effect results from the form of the $N(\varepsilon)$ curve). Beyond this range
the metal is paramagnetic ($\zeta = 0$). For another range of $n$, as shown, $D$
is negative so that the ferromagnetic ground state is unstable; here the metal
is presumably antiferromagnetic or has a spiral spin configuration. Finally,
for small values of $n$, as already implied by (27), $D$ is positive and $j > j_c$, so
that on the basis of these two criteria alone the metal should have a
ferromagnetic ground state. Since the Lieb-Mattis theorem (1962) forbids this in
the present case, however, it seems that further criteria for ferromagnetism
must be sought, both here and possibly also in more realistic cases. It may
be, for example, that, although $D$ is positive, the whole spin wave spectrum
$\hbar\omega_q$ reaches negative values at some larger values of $q$. On the other hand,
the Lieb-Mattis theorem does not apply in the presence of Heisenberg
exchange. It is thus of interest that the exchange integral $J$ needed to stabilize
the ferromagnetic ground state for the above calculation is small for all three
values of $I_{\mathrm{eff}}$ shown in Fig. 1, where the largest negative value of $D$ is only
about $-0.1\,Wa^2$.
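
The two limits (26) and (27) can be reproduced with a short numerical sketch for the fully polarized linear metal ($\zeta = 1$). Since the general expressions (20) and (25) are not legible in the source, the code assumes a one-dimensional form consisting of a kinetic term $(1/n\zeta)\sum_k \tfrac{1}{2}(f_{k+} + f_{k-})\,\varepsilon''(k)$ and a superexchange term $-(1/n\zeta\,\Delta E)\sum_k (f_{k+} - f_{k-})\,\varepsilon'(k)^2$, with $\Delta E = nI_{\mathrm{eff}}\zeta$ from (19) and sums taken per atom. This assumed form and the function name are illustrative, not the authors' formula.

```python
import math


def d_linear_chain(n, j, steps=20000):
    """D / (W a^2) for the linear metal at zeta = 1.

    ASSUMED form: kinetic term minus superexchange term, with
    eps(k) = W sin^2(a k / 2) and Delta E = n * j * W (Eq. (19), zeta = 1).
    Only the + spin band is occupied, up to a * k_F = pi * n.
    """
    x_max = math.pi * n
    h = x_max / steps
    curvature_sum = 0.0  # (1/pi) * integral of eps'' / (W a^2) = cos(x) / 2
    velocity_sum = 0.0   # (1/pi) * integral of (eps')^2 / (W a)^2 = sin^2(x) / 4
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint rule in x = a * k
        curvature_sum += 0.5 * math.cos(x) * h
        velocity_sum += 0.25 * math.sin(x) ** 2 * h
    curvature_sum /= math.pi
    velocity_sum /= math.pi
    kinetic = 0.5 * curvature_sum / n
    superexchange = -velocity_sum / (n * n * j)
    return kinetic + superexchange


if __name__ == "__main__":
    j = 0.8
    print("full zone, n = 1:", d_linear_chain(1.0, j), "expected -1/(8j) =", -1.0 / (8.0 * j))
    print("dilute limit, n -> 0:", d_linear_chain(1e-3, j), "expected 1/4")
```

Within this assumed form the kinetic term vanishes at $n = 1$ and the superexchange term vanishes as $n \to 0$, in line with the discussion of (25) to (27).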

Calculations of D are now in progress (Katsuki and Wohlfarth, 1966) for 
three-dimensional models of a metal, and diagrams of the type shown in 
Fig. 1 are being obtained. The problems treated are not unrelated to those 
discussed by Penn (1966) regarding the stability of magnetic metals. 



Fig. 2. (a) Experimental values of D for ferromagnetic alloys (for references, see text). 


V. Experimental Values of D of Ferromagnetic Alloys 

Measurements of $D$ using the small-angle neutron scattering technique
were made as follows: B.C.C. Fe-Ni, Fe-Co, Fe-Cr, Fe-V (Lowde et al.,
1965), Fe-Al (Antonini et al., 1966), F.C.C. Fe-Ni (Hatherly et al., 1964),
Ni-Pd (Stringfellow and Torrie, 1964), Ni-Cu (Stringfellow, 1965). Values
of the Curie temperature $T_c$ are given in the normal literature and have been
collected for the present purpose by Stringfellow. The variation of $D$ and
$T_c$ is shown in Figs. 2(a) and 2(b), respectively. Apart from some obvious
deviations between the two sets of curves the over-all qualitative correlation is
reasonable, but not sufficiently so that the theoretical discussion of this paper
can be said to be entirely vindicated. Clearly, further experimental and
theoretical work in this rewarding field is necessary.

Fig. 2. (b) Experimental values of Curie temperature $T_c$.

I. Introduction

It has become apparent that the understanding of metallic ferromagnetism
and antiferromagnetism is at present a central problem in solid state physics.
In the study of metallic ferromagnetism two extreme models, the itinerant
model and the localized model, are usually adopted. Since neither of them
seems to be adequate for an accurate description of d-electrons in transition
metals, a number of investigators are trying to find a middle way between
these two extremes.

It must be noted, however, that the one-electron band theory still seems
to be important, because it allows qualitative, sometimes semiquantitative,
descriptions of many properties of transition metals. Further, it is hoped that
the one-electron band model is useful as a starting point for more sophisticated
treatments. Therefore, it seems worthwhile to work out the band structure
of transition metals, especially in the ferromagnetic and antiferromagnetic
states, with a detailed examination of the basic assumptions from which the
band structure is derived.

Recently, the shape of the Fermi surface of some transition metals has been
elucidated experimentally by several investigators using various powerful
methods, so that it has become possible to examine whether or not the band theory
is able to reproduce the shape of the Fermi surface determined by experiment.

It is a somewhat complicated problem to calculate the band structure of
transition metals. Both the augmented plane wave (APW) method (Slater, 1937) and
the Green's function method (GFM) (Kohn and Rostoker, 1954), however, have
proved to be sufficiently accurate. An extensive study of the band
structure of transition metals by the APW method has been carried out by Slater's
group (Slater, 1965). On the other hand, the GFM was applied to copper by
Segall (1962) and to transition metals by the authors of this article (Yamashita
et al., 1963). Although the wave equation can be solved accurately for a given
potential, it remains a difficult problem to choose a suitable crystal potential.
Unfortunately, we have not yet found a practical method to derive the best
one-electron potential from first principles. The determination of the
crystal potential will be the main subject of Section III.

In this paper, we shall discuss the band structure of ferromagnetic iron
and nickel and of antiferromagnetic chromium, putting special emphasis on
the shape of the Fermi surface.

Another important problem is to elucidate the electronic structure of
transition-metal alloys. In general, the problem is a very difficult one, because
the mathematical difficulties inherent in a disordered system prevent the
proper development of the theory. The electronic structure of a simple
superlattice of transition elements is, however, easily evaluated by the APW
method or by the GFM. Here, we shall present the results of a calculation of
the electronic structure of the ferromagnetic superlattice CoFe.

II. Methods of Computation 

The computational procedure used here is the Green's function method.
The wave function is expanded within an inscribed sphere, and the $l$ = 0, 1,
and 2 components are considered. The energy values are evaluated at the
points of high symmetry in the Brillouin zone. The evaluation of the energy
at general points takes too much time, so the interpolation method developed
by Slater and Koster (1954) is adopted to calculate the energy value at general
points. The density-of-states is evaluated from the energy values thus
calculated at many points in the Brillouin zone, and then the Fermi surface is
calculated as a function of the wave number vector by the GFM. The wave
function is also calculated by the GFM. The perturbation method is used to
obtain the wave function outside the inscribed sphere.
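
The step from interpolated energy values to a density-of-states and a Fermi energy can be illustrated by a minimal sketch. A single simple-cubic tight-binding band is used here purely as a stand-in for the interpolated bands; the band, the mesh size, the bin count, and all function names are assumptions made for illustration and are not the scheme used in the calculations described above.

```python
import math


def band_energy(kx, ky, kz, hopping=1.0):
    """Stand-in band: simple-cubic tight binding (an ASSUMPTION, not the real bands)."""
    return -2.0 * hopping * (math.cos(kx) + math.cos(ky) + math.cos(kz))


def sample_energies(mesh=24):
    """Energies on a uniform midpoint mesh of the cubic zone, sorted ascending."""
    step = 2.0 * math.pi / mesh
    points = [-math.pi + (i + 0.5) * step for i in range(mesh)]
    return sorted(band_energy(kx, ky, kz)
                  for kx in points for ky in points for kz in points)


def density_of_states(energies, bins=30):
    """Histogram estimate of the density of states, normalized to one state."""
    lowest, highest = energies[0], energies[-1]
    width = (highest - lowest) / bins
    counts = [0] * bins
    for energy in energies:
        index = min(int((energy - lowest) / width), bins - 1)
        counts[index] += 1
    norm = len(energies) * width
    centers = [lowest + (b + 0.5) * width for b in range(bins)]
    return centers, [count / norm for count in counts]


def fermi_energy(energies, filling):
    """Energy below which the fraction `filling` of the sampled states lies."""
    index = min(int(filling * len(energies)), len(energies) - 1)
    return energies[index]


if __name__ == "__main__":
    energies = sample_energies()
    centers, dos = density_of_states(energies)
    print("Fermi energy at half filling:", fermi_energy(energies, 0.5))
    print("largest density of states:", max(dos))
```

A finer mesh or a tetrahedron-type integration would give a smoother curve; the histogram is chosen only because it is the simplest way to show the idea.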

Besides the GFM the APW method is also used to calculate the energy
values and the wave functions of the superlattice CoFe at points of high
symmetry. The APW method is quite suitable for the calculation of the wave
function outside the inscribed sphere.

III. The Determination of the Potential

The periodic potential which we use for determination of energy bands
must be in some sense a self-consistent potential. The construction of the
Hartree-Fock potential, however, is out of the question owing to computational
difficulties. Further, it is not at all obvious that the HF potential is the best
effective potential, because the correlation effect may be very important in
the study of d-electrons in transition metals. Therefore, we must have some
simple procedure by which the crystal potential is determined from the
knowledge of the Bloch functions below the Fermi surface. Here, such a
potential will be called the "effective potential." In general, the effective potential will
depend upon the wave number vector as well as upon the quantum number
$l$, but it is much easier to choose an effective potential that is a function of
position only. At the present stage of development of the theory, the effective
potential is justified only when it leads to results in agreement with
experiment, so that the band theory is not completely free from a
phenomenological character.

As the first approximation the crystal potential is assumed to be

$V_0(\text{metal}) = V(\text{free ion}) + V(\text{single OPW})$,

where $V$(free ion) is the Chodorow potential for the 3d HF function of the
free (metal)$^+$ ion, and $V$(single OPW) is the potential due to conduction
electrons whose wave function is assumed to be the single OPW function. In
the next approximation the potential is assumed to be given by

$V(\text{metal}) = V(\text{free ion}) + V_c(\text{metal}) - V_c(\text{atom}) + V_{ex}(\text{metal}) - V_{ex}(\text{atom})$.

Here, $V_c$(atom) is the Coulomb potential due to the atomic 3d electrons, and
$V_{ex}$(atom) is the exchange potential in the configuration of the free atom, which
is calculated by the Hartree-Fock-Slater (HFS) free-electron approximation. The
Coulomb potential, $V_c$(metal), is determined from the charge distribution of
all electrons in the 3d and the conduction bands. The potential due to the
conduction electrons is evaluated from the knowledge of the wave function
at the energy $\tfrac{1}{2}E(\Gamma_1) + \tfrac{1}{2}E_F$ and the number of conduction electrons.
The charge distribution of d-like electrons is calculated at seven energy values
in the 3d bands. After the 3d wave function is evaluated, averaged
over the angle variables, and normalized in the Wigner-Seitz cell, the charge
distribution is determined as a function of position for each of the seven energy
parameters. By multiplying each of the seven charge distributions by a weight
obtained from the density-of-states of the d-bands,
and summing them up, we obtain the total charge distribution of the 3d
electrons. The exchange potential of the metal, $V_{ex}$(metal), is evaluated again
by the HFS free-electron approximation. Since the crystal potential
and the wave functions are thus interrelated, they must be determined
self-consistently.

The energy band of copper is computed to test the potential mentioned 
previously. The results are in good agreement with those obtained by Segall 
with the Chodorow potential. For the purpose of illustration, the relative 
energy values obtained by the self-consistent potential are listed in Table 1, 
together with those obtained in earlier calculations (Wakoh, 1965). 

TABLE 1

Some Band Energies^a of Copper Obtained by Various Potentials

Potential     $\Gamma_{25'} - \Gamma_1$    $X_5 - \Gamma_1$    $X_5 - X_1$
Segall        0.331                        0.470               0.300
Chodorow      0.399                        0.516               0.249
Mattheiss     0.468                        0.572               0.250
Present       0.386                        0.499               0.245

^a In rydberg units.


IV. Results 

A. Ferromagnetic iron 

In a ferromagnetic state the up-spin electrons and the down-spin electrons
move in different self-consistent potentials. The difference in the potential
is taken into account through the exchange potential:

$\lambda\{V_{ex}(\text{metal}) - V_{ex}(\text{atom})\}$,

where $\lambda$ is an adjustable parameter, which is determined so as to make the
number of the down-spin electrons 5.1, and that of the up-spin electrons 2.9
per atom. As the result of the calculation the value of $\lambda$ is determined as 0.5.
The $E(k)$ curves thus obtained are illustrated in Fig. 1. The Fermi energy is
calculated as −0.142 Ry, and the temperature coefficient of the electronic
specific heat, $\gamma$, is evaluated as 9.4 × 10$^{-4}$ cal/mole deg$^2$, while the observed
value is 12 × 10$^{-4}$ cal/mole deg$^2$. The bandwidth ($H_{25'} - H_{12}$) is also
evaluated as 0.395 Ry for the up-spin band, and 0.352 Ry for the down-spin band.
The exchange splitting energy is not a constant, but depends on the state, for
example, 1.78 eV at $\Gamma_{12}$ and 0.46 eV at $\Gamma_1$.
Next, let us discuss the shape of the Fermi surface derived from the
magnetoresistance observed by Reed and Fawcett (1964, 1965). The experimental data
are consistent with the behavior of a compensated metal. From the band
structure obtained here, it can easily be seen that the numbers of electrons
on the electron sheets and of holes on the hole sheets are equal. The result of the
observation is illustrated in Fig. 2. On the other hand, the Fermi surface
obtained by the present calculation is shown in Fig. 3 (the up-spin band)
and in Fig. 4 (the down-spin band). As seen from Fig. 4, there appear an
electron Fermi surface around the $\Gamma$ point and two small hole pockets around
the H point. The holelike multiply connected surface exists in the [1, 1, 0]
direction, which connects the H points. Therefore, the open orbit is
observable whenever the magnetic field lies in the {1, 1, 0} plane. [It corresponds
to the full line in Fig. 2(b).] According to the theory, the open orbit in the
[0, 0, 1] direction is observable only when the magnetic field lies in the
{0, 0, 1} plane within an angle of 8° from the [1, 0, 0] axis. [It is shown by
the heavy broken line in Fig. 2(b).] According to the experiment, however, the
open orbit exists when the angle is beyond 8°. This may be interpreted by the
occurrence of the magnetic breakdown phenomenon at the A 5 axis on
the Fermi surface of the up-spin bands (Fig. 3). Further, the open orbit in
the [1, 3, 0] direction seems to be observed by Fawcett and Reed.

Fig. 1. The calculated bands for ferromagnetic iron along the various symmetric axes
in the Brillouin zone. Solid curves represent the spin-up bands and dashed curves represent
the spin-down bands. $E_F$ denotes the Fermi energy.

Fig. 2. (a) Stereogram showing the position of the minimum in the magnetoresistance
rotation curves for an iron whisker. The current direction J is indicated by a double square,
the minima near the dashed-line (100) planes by ○ and □ (the latter corresponding to the
back of the stereogram), the minima near the continuous-line (110) planes by ×, and the
shallow minima in non-symmetry direction by ●. (After Reed and Fawcett, 1965.)

(b) Stereogram which corresponds to Fig. 2(a), obtained by the theory. The full line
shows the position of the minima due to the open orbit in the [110] direction, the heavy
broken line due to that in the [001] direction on the hole-like surface, the light broken
line due to that in the [001] direction on the electron-like surface, and the dotted line due
to that in the [130] direction. It corresponds to the black circle in (a).

Fig. 3. Intersection of (110) and (100) up-spin planes with the Fermi energy for
ferromagnetic iron.

Fig. 4. Intersection of (110) and (100) down-spin planes with the Fermi energy for
ferromagnetic iron.

The de Haas-van Alphen studies were carried out by Anderson and Gold
(1963). They observed oscillations of period (2.16 ± 0.15) × 10$^{-7}$ G$^{-1}$
in a [100] whisker, and of period (2.71 ± 0.16) × 10$^{-7}$ G$^{-1}$ in a [110]
whisker. The observed period will be ascribed to the lens in the
up-spin Fermi surface. The period is anisotropic, and it is estimated to be
2-5 × 10$^{-7}$ G$^{-1}$, but it is rather sensitive to the choice of the potential. The
effective mass is also determined as 0.25 $m_0$ by the experiment. The
corresponding value is estimated as 0.26 $m_0$ by the detailed calculation of the energy
values near the lens surface.
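
A de Haas-van Alphen period can be converted into an extremal cross-sectional area of the Fermi surface with the standard Onsager relation, $A = 2\pi e F/\hbar$ with $F = 1/P$. This relation is not stated in the text above and is brought in here as an assumption; the function name is illustrative, and the sketch works in SI units after converting the period from G$^{-1}$ to T$^{-1}$.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulomb (SI defined value)
HBAR = 1.054571817e-34               # joule second
GAUSS_PER_TESLA = 1.0e4


def extremal_area_from_period(period_per_gauss):
    """Extremal k-space area in m^-2 from a dHvA period given in G^-1.

    Uses the Onsager relation A = 2*pi*e*F/hbar, where F = 1/P is the
    oscillation frequency in tesla (an ASSUMED standard relation, not
    quoted in the text).
    """
    frequency_gauss = 1.0 / period_per_gauss
    frequency_tesla = frequency_gauss / GAUSS_PER_TESLA
    return 2.0 * math.pi * ELEMENTARY_CHARGE * frequency_tesla / HBAR


if __name__ == "__main__":
    for label, period in (("[100] whisker", 2.16e-7), ("[110] whisker", 2.71e-7)):
        area = extremal_area_from_period(period)
        # 1 m^-2 = 1e-20 inverse square angstrom
        print(f"{label}: A = {area:.3e} m^-2 = {area * 1e-20:.4f} A^-2")
```

A longer period corresponds to a smaller extremal area, so the [110] whisker samples a slightly smaller cross section of the lens than the [100] whisker.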

The wave functions of both spin states are evaluated. The wave function of
the down-spin electrons is a little contracted as compared with that of the
up-spin electrons because of the difference in the exchange potential. As a
result, it is expected that the direction of the spin polarization is reversed in
the boundary region of the Wigner-Seitz cell, where the density of the up-spin
electrons exceeds the density of the down-spin electrons even though the
former are in the minority. Of course, the polarization of s-electrons must be
considered in order to discuss the experimental result.


B. Ferromagnetic nickel 

The band structure of both paramagnetic and ferromagnetic nickel was
worked out by the tight-binding approximation (Fletcher and Wohlfarth, 1951),
by the APW method (Hanus, 1962; Mattheiss, 1964), and by the GFM
(Yamashita et al., 1963; Wakoh and Yamashita, 1964). From these works it
becomes apparent that the band theory is able to reproduce the main observed
character of the Fermi surface in ferromagnetic nickel. The self-consistent
potential mentioned previously is also applicable to this problem. If we assume
that $\lambda = 1$, the self-consistent potential gives the following results (Wakoh,
1965). The number of up-spin electrons is estimated as 4.67 and that of
down-spin electrons as 5.33, so that the Bohr magneton number is evaluated as
0.66, while the experimental value is 0.6. The s-band of down-spin electrons
has a neck around the L point. The calculated value of the radius of the neck
is about 1.25 times the observed value. The density-of-states curve has a
high peak near the Fermi energy. The value of $\gamma$ of the electronic specific heat
is estimated from the theory as 14 × 10$^{-4}$ cal/mole deg$^2$, while the observed
value is 17 × 10$^{-4}$ cal/mole deg$^2$. The mixing ratio of the $\gamma$-components and
the $\varepsilon$-components of the unbalanced d-electron numbers is determined by the
theory as 27.4% and 72.6%, respectively. The corresponding experimental
values are 27% and 73%, respectively. In the case of iron, we have also
obtained good agreement; that is, the theoretical values are 53.7% and
46.3%, respectively, while the experimental values are 53% and 47%,
respectively.

C. Antiferromagnetic chromium

It is not at all obvious that the one-electron band theory is able to explain
why chromium exhibits metallic antiferromagnetism. It is expected, however,
that many experimental facts on antiferromagnetic chromium can be derived
from the band theory, once the antiferromagnetic state is assumed to be
realized in Cr. Some results of a preliminary calculation follow.

First, we consider metallic chromium in a complete antiferromagnetic
state. Then, the problem is to calculate the band structure of a CsCl-type
superlattice, where the spin dependence of the potential is taken into account
self-consistently by the HFS free-electron approximation (Slater and Koster,
1954). Here, the number difference between the up-spin electrons and the
down-spin electrons on each atom is assumed to be 0.6. Then, the
spin-dependent potential is given by

$V_\uparrow(\mathrm{Cr}\!\uparrow) = V_\downarrow(\mathrm{Cr}\!\downarrow) = V_{sc} - \Delta V_{ex}$

and

$V_\uparrow(\mathrm{Cr}\!\downarrow) = V_\downarrow(\mathrm{Cr}\!\uparrow) = V_{sc} + \Delta V_{ex}$,

where $V_\uparrow$ means the potential for up-spin electrons, $V_\downarrow$ that for down-spin
electrons, Cr↑ means the lattice site which is normally occupied by up-spin
electrons and Cr↓ the lattice site normally occupied by down-spin electrons,
and $V_{sc}$ is the potential in the paramagnetic state, which is determined
self-consistently by the method mentioned in Section III. The exchange potential
$\Delta V_{ex}$ is expressed as:

$\Delta V_{ex} = \lambda \times 6\{(3\rho/4\pi)^{1/3} - (3\rho_0/4\pi)^{1/3}\}$.

Here,

$\rho = \rho_0 + \delta\rho$,

where 2p 0 is the total charge density, and dp is calculated from the d-wave 
functions of the states being situated in a part of the &-space just below the 
Fermi surface, which contains 0.3 electrons per atom. The two values, 0.5 
and 1, are assigned to the parameter X. The E(k) curves thus obtained are 
illustrated in Fig. 5. As seen from the figure, an energy gap appears near the
Fermi surface on the S and A axes. It is important to note that this gap
extends widely in k-space and appears mostly near the Fermi surface. In fact,
there is an electron Fermi surface around $\Gamma$ and a hole Fermi surface
around H in paramagnetic chromium. (The Fermi surface is almost identical to
that illustrated in Fig. 3.) In antiferromagnetic chromium, however, the two
surfaces overlap in the simple cubic Brillouin zone, so as to bring about the
energy gap. It is remarkable that the energy gap appears over almost the whole
area where the electron surface around $\Gamma$ and the hole surface around H
touch each other. Only small regions around the A axes are exceptional. Let us
look at the lower boundary surface of the energy gap, which separates the
allowed region from the forbidden one. The energy at a point on this surface
is a function of the direction of the k-vector. There are local maxima on this
surface, and the holes exist near these maxima. It is another remarkable fact
that the energies of all the local maxima are nearly equal; the difference is
at most 0.005 Ry.

By comparing the band energy of the antiferromagnetic state, $E_A(k)$, with
that of the paramagnetic state, $E_P(k)$, we see that the following relations
are satisfied:

$$E_A(k) < E_P(k) \quad \text{when } E_P(k) < E_F,$$

and

$$E_A(k) > E_P(k) \quad \text{when } E_P(k) > E_F,$$

with a few exceptions near the Fermi surface. In general, the energy
difference is quite small (of the order of 0.001 Ry), but it becomes larger
near the boundary surface of the energy gap (of the order of 0.01 Ry). The
width of the gap on the surface of the Brillouin zone, on the other hand,
varies from point to point. At some points it amounts to more than 0.03 Ry,
while at other points it is much smaller.




Fig. 5. Energy bands in antiferromagnetic chromium ($\lambda = 0.5$). The
upper and the lower dotted horizontal lines show the Fermi energy in the
paramagnetic state and in the antiferromagnetic state, respectively. (Energy
axis ticks: 0.0, −0.3, −0.6, −0.9 Ry.)

In order to confirm self-consistency of the exchange potential we have
calculated the wave function at 64 points in the Brillouin zone, and estimated
the spin density at the up-spin lattice site. The result is shown in Table 2.
As seen from the table, only the d-electrons contribute to the spin
polarization.


TABLE 2

The Contribution From s-, p-, and d-Components of the Wave
Functions to the Magneton Number at the Chromium Site

                           At up-spin site        At down-spin site
                Outside    s      p      d        s      p      d       Net spin
$\lambda = 1$     1.02    0.18   0.18   2.56     0.18   0.18   1.70      0.86
$\lambda = 0.5$   1.02    0.18   0.18   2.35     0.18   0.18   1.91      0.44


The spin polarization amounts to about $0.86\,\mu_B$ and $0.44\,\mu_B$ for
$\lambda = 1$ and $\lambda = 0.5$, respectively. Thus, we see that a
self-consistent exchange potential is attainable if $\lambda$ is assumed to be
a little larger than 0.5. It is quite important that the observed magnetic
moment can be derived from the band model with a reasonable assumption about
the potential. In the antiferromagnetic state, the wave function is completely
polarized on the boundary surface of the gap, and the boundary surface above
the gap has a polarization opposite to that of the surface below the gap.
Therefore, the most favorable condition for the antiferromagnetic state is
that the greater part of the Fermi surface lies within the energy gap. As
mentioned previously, the energy gap appears mostly near the Fermi surface and
is widely extended, so that it is quite certain that the formation of the
antiferromagnetic sublattice is favorable to the band energy. Further, the
existence of the gap near the boundary of the Brillouin zone may contribute to
the energy gain in the antiferromagnetic state. Thus, the band theory seems to
provide some evidence for metallic antiferromagnetism in chromium. It is not
at all obvious, however, that the antiferromagnetic state is really stable,
because the correlation energy is not considered in this discussion.
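
The arithmetic behind Table 2 can be checked with a short sketch. It verifies that the quoted net spin equals the difference of the two d-components (the s- and p-components cancel), and it estimates the value of $\lambda$ that would reproduce the assumed moment of 0.6. The linear interpolation between the two tabulated values is our own assumption for illustration, not part of the calculation reported here, and the function names are hypothetical.

```python
# Table 2 entries: lambda -> (d at up-spin site, d at down-spin site, quoted net spin).
TABLE_2 = {
    1.0: (2.56, 1.70, 0.86),
    0.5: (2.35, 1.91, 0.44),
}


def net_spin(d_up, d_down):
    """Net spin from the d-components only (s and p cancel: 0.18 - 0.18)."""
    return d_up - d_down


def interpolate_lambda(target_moment, table=TABLE_2):
    """Estimate lambda for a target net spin.

    ASSUMPTION: the net spin varies linearly with lambda between the two
    tabulated points. This is an illustration, not a result from the text.
    """
    (lam0, m0), (lam1, m1) = sorted((lam, row[2]) for lam, row in table.items())
    return lam0 + (target_moment - m0) * (lam1 - lam0) / (m1 - m0)


if __name__ == "__main__":
    for lam, (d_up, d_down, quoted) in TABLE_2.items():
        print(lam, round(net_spin(d_up, d_down), 2), quoted)
    # Moment of 0.6 assumed in constructing the potential.
    print(round(interpolate_lambda(0.6), 2))
```

Under this assumed linear dependence the estimate falls between the two tabulated values of $\lambda$, on the side nearer 0.5.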

From the point of view of spin-density-wave theory (Overhauser, 1962), the
model described above is regarded as a special case of the spin-density-wave
model with the wave vector

$$Q = (2\pi/a)(1, 0, 0).$$

According to experiment, the antiferromagnetic lattice structure is slightly
modulated by a spin density wave with a long period, equivalent to the length
of 20.8 unit cells. Therefore, the simple antiferromagnetic model is not
applicable to pure chromium. Lomer (1962) was the first to apply Overhauser’s
idea to chromium. He suggested that a more favorable state is one in which a
spin density wave with the wave vector

$$Q_1 = (2\pi/a)(0.95, 0, 0)$$

exists, because the electron Fermi surface around $\Gamma$ is slightly smaller
than the hole Fermi surface around H, so that they are most effectively
coupled, with the largest contact area, when the hole surface is displaced by
$Q_1$. It is rather hard to derive a definite value of $Q_1$ from the band
theory, because it depends sensitively on the relative position of the s- and
d-bands. As is well known, this relative position is quite sensitive to the
potential. It is almost certain, however, that the value of $Q_1$ lies between
0.9 and 1 (in units of $2\pi/a$), according to the present calculation. At
present we are not able to go further; it is very hard to derive any other
quantitative conclusions for the spin-density-wave state from the band
theory.
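
The relation between the wave vector of the spin density wave and the length of the modulation can be made explicit with a small sketch. It assumes that the long period is the beat between the commensurate vector $(2\pi/a)(1,0,0)$ and a wave vector $(2\pi/a)(q,0,0)$, so that the period is $a/(1-q)$; the function names are ours.

```python
def modulation_period_cells(q):
    """Period of the modulation in unit cells, for a wave vector (2*pi/a)(q, 0, 0).

    ASSUMPTION: the period is 2*pi / |Q - Q_1| with Q = (2*pi/a)(1, 0, 0),
    i.e. a / (1 - q).
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    return 1.0 / (1.0 - q)


def q_from_period(period_cells):
    """Inverse relation: the q that gives a modulation of the stated length."""
    if period_cells <= 1.0:
        raise ValueError("period must exceed one unit cell")
    return 1.0 - 1.0 / period_cells


if __name__ == "__main__":
    print(modulation_period_cells(0.95))   # close to 20 unit cells
    print(q_from_period(20.8))             # q implied by the observed period
```

On this reading, the value 0.95 corresponds to a period of about 20 unit cells, and the observed period of 20.8 unit cells to a q slightly larger than 0.95, well inside the range 0.9 to 1.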

It is observed that the period of the spin density wave is quite sensitive
to small amounts of impurities. For example, the period is decreased by the
addition of a small amount of vanadium, while it is increased by the addition
of a small amount of Mn. The Lomer model readily gives a qualitative
explanation of these phenomena. When a small amount of Mn is added as an
impurity, the total number of electrons is increased, so that the electron
Fermi surface becomes larger and the hole Fermi surface smaller. Thus, the
value of $Q_1$ is increased and the difference $(Q - Q_1)$ becomes smaller,
which makes the period longer. When a certain amount of Mn is added, the
electron and hole Fermi surfaces have the same size, and the complete
antiferromagnetic structure is expected to be realized. The decrease of the
period in the case of vanadium impurities can be interpreted by a similar
argument. A more detailed study is necessary, however, in order to develop a
quantitative theory of the impurity effect. According to experiment, a
complete antiferromagnetic state is realized when about 2 at. % of Mn is added
to chromium. At present, it is not certain whether the electron and hole Fermi
surfaces really have the same size at this impurity content, or whether the
spin density wave cannot be stable at this impurity content owing to impurity
scattering.

D. Ferromagnetic superlattice CoFe (50 atomic percent each) 

As the crystal potential we use $V_0$(Fe metal) and $V_0$(Co metal) in the
respective inscribed atomic spheres. In this case, the self-consistent
approach has not yet been carried out. The number of s-electrons is assumed to
be one for each atom. In the ferromagnetic state all d-bands of down-spin
states are assumed to be filled with electrons. Then, 6n up-spin electrons
occupy the s- and d-bands up to the Fermi energy. Here, n is the number of
atomic pairs per unit volume. The band structure of the superlattice CoFe is
determined by the GFM at the points of high symmetry. For comparison, the band
structures of bcc Fe and Co with the same lattice constant (a = 5.3873 au) are
calculated. Except for the appearance of the energy gaps, the E(k) curves of
the superlattice CoFe are quite similar to those of iron and cobalt in the bcc
structure. The density-of-states in CoFe is found to be nearly equal to the
simple average of those in Co and Fe. Therefore, the general character of the
density-of-states is quite similar to that of metallic iron in the bcc
structure. It has two peaks with a minimum between them. The Fermi energy lies
slightly above the minimum. These theoretical results are in good agreement
with the general prediction given by Mott on the basis of the rigid band model
(Mott, 1964).

Needless to say, the appearance of the energy gaps is the most important
feature of the superlattice. The general character of the E(k) curves in CoFe
is quite similar to that in Cr mentioned previously. The Fermi energy lies in
the region between $E(\Gamma_{12})$ and $E(\Gamma_{25'})$, so that the
formation of the superlattice will reduce the band energy, and the
superlattice structure seems to be stabilized.

The neutron diffraction experiment reveals that the iron cell has a magnetic
moment of about $3\mu_B$ and the cobalt cell a moment of about $2\mu_B$. This
means that there are about seven d-electrons in the iron cell and about eight
d-electrons in the cobalt cell. In the band picture, it means that the wave
function is more concentrated in the cobalt cell in the lower part of the
d-bands, while it is more concentrated in the iron cell in the upper part of
the d-bands, so as to make the charge distribution more concentrated in the
cobalt cell. Detailed investigation of the nature of the wave function in CoFe
confirms that this prediction is approximately correct within the accuracy of
the present calculation.
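
The step from the measured moments to the d-electron numbers can be sketched as follows. The sketch assumes, as in the model above, that the d-bands of one spin direction are completely filled (five electrons per cell), and further assumes that the moment comes from the d-electrons alone; the function name is ours.

```python
D_STATES_PER_SPIN = 5  # capacity of the d-shell for one spin direction


def d_electrons_from_moment(moment_mu_b):
    """Number of d-electrons in a cell with the given moment (in Bohr magnetons).

    ASSUMPTIONS: one spin direction of the d-bands is full, and the moment is
    carried entirely by the d-electrons (s-electrons unpolarized).
    """
    filled = D_STATES_PER_SPIN
    partly_filled = D_STATES_PER_SPIN - moment_mu_b
    return filled + partly_filled


if __name__ == "__main__":
    print("Fe cell:", d_electrons_from_moment(3))
    print("Co cell:", d_electrons_from_moment(2))
```

With moments of about $3\mu_B$ and $2\mu_B$ this gives about seven and eight d-electrons, in line with the numbers quoted in the text.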

Thus, we find that the average character of the compound lattice is revealed
in the band structure and in the density-of-states, while the individual
character of each component is manifested in the nature of the crystal wave
functions. These rather opposite characters are well harmonized in CoFe,
because the crystal has full translational symmetry in the superlattice
structure.

Next, let us speculate about the electronic structure of the disordered
lattice of CoFe. According to the neutron diffraction experiment, the magnetic
moments of the iron cell and the cobalt cell in the disordered lattice are
nearly equal to those in the ordered lattice. Moreover, the magnetic moment is
practically constant over a wide range of composition. Thus, each component in
CoFe seems to retain its individual character. On the other hand, the
electronic specific heat of the disordered 50-50 at. % CoFe is again nearly
equal to that of the superlattice. Further, the variation of the electronic
specific heat with composition is well interpreted by the rigid band model;
that is, the density-of-states obtained by the computation agrees fairly well
with that determined by experiment. Therefore, the band theory seems to be
applicable to the disordered lattice as well. The individual character and the
itinerant character, however, do not seem to be compatible except in the
superlattice. Let us consider the lower part of the d-band, where the wave
function has a large amplitude only at the positions of Co. Since the cobalt
atoms are not arranged in order, the wave function must be distorted
considerably from the Bloch function. The wave function near the Fermi
surface, however, may have a different character. It may have nearly equal
amplitude in both atomic cells and be connected almost smoothly from a cobalt
cell to the next iron cell, so that the density-of-states near the Fermi
surface is expected to be given by that obtained from the average potential
model.


I. Introduction

As pointed out by London (1935, 1950), superfluid He II and superconductors
are quantum systems on a macroscopic scale. In recent years, many beautiful
experiments have been done which illustrate the quantum aspects in striking
ways. These include demonstrations of quantization of circulation in He II and
of flux in superconductors. Remarkable quantum interference effects in
superfluid flow, analogous to single- and double-slit diffraction in optics,
have been observed with the use of Josephson tunnel junctions. While the
original descriptions of superfluids were phenomenological, it is now possible
to see how the characteristic properties may be derived from basic microscopic
theory and to see more clearly what the essential features are. As we shall
see, an important concept is a macroscopic condensate wave function with
amplitude and phase.

In order to keep the mathematics as simple as possible, we shall for the most
part suppose that space variations are sufficiently slow that a local theory
can be used. This should be true under almost all conditions in He II, but
in superconductors, except near the transition temperature $T_c$, a nonlocal
theory is generally required. Even though our local theory may not give a
correct quantitative description of superconductors, it does give a
satisfactory qualitative picture. We shall not be concerned very much with
electromagnetic properties of superconductors, so that most of what we have to
say will be equally valid for superfluid He. Further, we shall be concerned
mainly with quasistatic flow and assume that time variations are sufficiently
slow that quasiparticle excitations are in local equilibrium.

A basic property of a superfluid is macroscopic occupation of a given
quantum state. In He II it is the momentum state of the Bose condensate; in
superconductors, where the electrons are fermions, it is the common momentum
of the ground-state pairs. In He II at rest, even in the presence of
interactions between the particles, a finite fraction of the atoms is in the
momentum state $\mathbf{p}_s = 0$. The momentum distribution of the
ground-state wave function has a $\delta$-function peak at $\mathbf{p} = 0$,
estimated (McMillan, 1965) to contain about 11% of the particles at $T = 0$,
the fraction decreasing to zero at the $\lambda$-point. Although there is as
yet no adequate theory of the $\lambda$-transition, it is generally assumed
that it corresponds to the onset of macroscopic occupation of the ground
state. If there is flow in the ground state, the state of macroscopic
occupation, $\mathbf{p} = \mathbf{p}_s$, is different from zero. The flow may
vary slowly from point to point, so that $\mathbf{p}_s$ may be a function of
position. Often the ground-state flow is specified by the velocity,
$\mathbf{v}_s = \mathbf{p}_s/m$, where $m$ is the actual mass of a helium atom.

The ground-state wave function of a superconductor with no current flow
is made up of configurations in which the electron states are occupied in pairs
of opposite spin and momentum $(\mathbf{p}\uparrow, -\mathbf{p}\downarrow)$
such that if one of the pair is occupied in any configuration, the other is
also (Bardeen et al., 1957). When there is persistent current flow, the pairs
$(\mathbf{p} + \tfrac{1}{2}\mathbf{p}_s\uparrow, -\mathbf{p} + \tfrac{1}{2}\mathbf{p}_s\downarrow)$
each have exactly the same momentum $\mathbf{p}_s$. Thus
$\mathbf{p}_s = 2m\mathbf{v}_s$ defines the common momentum per pair in the
ground state and $\mathbf{v}_s$ the velocity of flow. The definition of
$\mathbf{v}_s$ is somewhat arbitrary. For our purpose it is most convenient to
take for $m$ the true mass, not an effective mass for the Bloch electrons.

In the two-fluid model (Tisza, 1938; Landau, 1941), the density of superfluid
matter flow is $\mathbf{j}_s = \rho_s\mathbf{v}_s$. This is the equilibrium
flow associated with a given small velocity $\mathbf{v}_s$ in the ground
state. If, starting from rest, the whole system were displaced by velocity
$\mathbf{v}_s$, the matter flow would be $\rho\mathbf{v}_s$, where $\rho$ is
the total density. If we now let the excitations come into local equilibrium,
keeping $\mathbf{v}_s$ fixed, the flow will decrease. For a normal system, the
local equilibrium would correspond to $\mathbf{j}_s = 0$ and $\rho_s = 0$. For
a superfluid, however, a net flow $\rho_s\mathbf{v}_s$ remains; this defines
the superfluid density $\rho_s$. The normal component of flow is that
associated with a nonequilibrium distribution of excitations of the system. In
He II, such excitations can come into equilibrium with each other; the
equilibrium may correspond to that which would exist in a reference frame
moving with velocity $\mathbf{v}_n$. The total flow relative to the rest frame
is then $\mathbf{j} = \rho_n\mathbf{v}_n + \rho_s\mathbf{v}_s$. In a
superconductor, the excitations generally come into equilibrium with the
lattice rather than with each other, so that $\mathbf{v}_n = 0$ in the absence
of applied fields.

A macroscopic variable important for describing superfluid flow is the
phase $\chi(\mathbf{r})$ of the superconducting wave function. It has the
property that if $\chi(\mathbf{r})$ varies slowly in space,
$\mathbf{p}_s(\mathbf{r}) = \hbar\nabla\chi(\mathbf{r})$. The concept of an
effective wave function was introduced in some of the early phenomenological
or semiphenomenological theories of superfluid flow in helium and in
superconductors. Its importance has received added emphasis in connection with
Josephson tunneling in superconductors and related experiments in He II.

Some recent discussions of the theory from this point of view are those of
Anderson (1964) and of Josephson (1964, 1966) for superconductors and of
Anderson (1966), Martin (1965), and Hohenberg and Martin (1964) for
superfluid helium. The author [see Bardeen (1965)] has given a brief
introduction to the subject. A symposium on quantum fluids was held at the
University of Sussex in 1965. The proceedings (D. F. Brewer, editor, 1966)
contain both review articles and short accounts of original research.

The phase may be defined in terms of microscopic theory by use of an
appropriate Green’s function formalism. However, if the space variations are
sufficiently slow that a local theory is appropriate, simple, quite general
considerations can be used. Following a brief review of some of the earlier
theories, we give a general formulation valid in the local limit. Basic
definitions in terms of Green’s functions follow. We conclude by discussing
several examples that illustrate the use of the phase in describing quantum
aspects of superfluids on a macroscopic scale.


II. Phase and Condensate Wave Functions 

Superfluids have the features of a single macroscopic quantum state
extending throughout the entire body. This is true even at finite temperatures
in the presence of thermal excitations and also when there is impurity
scattering. Several approaches have been used to define a condensate wave
function to describe superfluid flow.

Ginzburg and Landau (1950) introduced on phenomenological grounds the
concept of an effective wave function
$\psi(\mathbf{r}) = a(\mathbf{r})e^{i\chi(\mathbf{r})}$ to describe superfluid
flow in superconductors. The amplitude and phase are related to $\rho_s$ and
$\mathbf{p}_s$:

$$a^2 = \rho_s/\rho, \qquad (1)$$

$$\mathbf{p}_s = -\frac{i\hbar(\psi^*\nabla\psi - \psi\nabla\psi^*)}{2\psi^*\psi}. \qquad (2)$$

The latter gives

$$\mathbf{p}_s = \hbar\nabla\chi. \qquad (3)$$

As so defined, $\mathbf{p}_s$ is the canonical momentum; in the presence of a
vector potential $\mathbf{A}(\mathbf{r})$ the kinetic momentum is

$$m^*\mathbf{v}_s = \mathbf{p}_s - (e^*/c)\mathbf{A} = \hbar\nabla\chi - (e^*/c)\mathbf{A}. \qquad (4)$$

In the modern version of the theory (Gor’kov, 1959), $m^* = 2m$ and
$e^* = 2e$ are the mass and charge of a pair. When $\mathbf{v}_s$ is replaced
by the corresponding density of flow $\rho_s\mathbf{v}_s$, Eq. (4) is
essentially the first London equation that accounts for the Meissner effect.
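
The step from Eq. (2) to Eq. (3) may be written out explicitly. The sketch below takes the effective wave function as an amplitude times a phase factor, with real amplitude $a$ and real phase $\chi$, and takes Eq. (2) in the form $\mathbf{p}_s = -i\hbar(\psi^*\nabla\psi - \psi\nabla\psi^*)/2\psi^*\psi$; the overall sign is assumed here so that the result agrees with Eq. (3).

```latex
\begin{align*}
\psi &= a\,e^{i\chi}, \qquad a,\ \chi\ \text{real},\\
\psi^*\nabla\psi &= a\nabla a + i a^2\nabla\chi,\\
\psi\nabla\psi^* &= a\nabla a - i a^2\nabla\chi,\\
\psi^*\nabla\psi - \psi\nabla\psi^* &= 2 i a^2\nabla\chi,\\
\mathbf{p}_s &= -\frac{i\hbar\,(2 i a^2\nabla\chi)}{2a^2} = \hbar\nabla\chi .
\end{align*}
```

The amplitude drops out, so that the momentum is fixed by the gradient of the phase alone, while the amplitude enters only through Eq. (1).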

The phase $\chi$ plays the same sort of role in determining superfluid flow
in superconductors that the voltage does in ordinary ohmic flow in normal
metals. If there is no supercurrent flow, $\chi$ must be the same everywhere
in a superconductor, even over “miles of dirty lead wire,” just as the voltage
is the same everywhere in a normal metal in the absence of a current. The
long-range coherence of phase is intimately connected with the superfluid
properties.

Not long after Ginzburg and Landau proposed their theory of
superconductivity, Penrose and Onsager (1956) gave a more microscopic approach
to defining an effective wave function for He II. They suggested that the
density matrix of the superfluid is of the form

$$\rho(\mathbf{r}, \mathbf{r}') = \int \Psi^*(\mathbf{r}, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_N)\,\Psi(\mathbf{r}', \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_N)\, d\tau_2 \cdots d\tau_N$$

$$= \psi^*(\mathbf{r})\psi(\mathbf{r}') + \text{incoherent terms}. \qquad (5)$$

The incoherent terms vanish for $|\mathbf{r} - \mathbf{r}'|$ large. The first
term is the coherent contribution, which remains finite as
$|\mathbf{r} - \mathbf{r}'| \to \infty$; it corresponds to what Yang (1962)
calls off-diagonal long-range order (ODLRO). If $n_0$ is the density of
particles in the condensate,

$$\psi(\mathbf{r}) = [n_0(\mathbf{r})]^{1/2} e^{i\chi(\mathbf{r})}. \qquad (6)$$

As we shall discuss in Section VI, a two-particle density matrix is required
for superconductors.

The phase was introduced by London (1950) and by Feynman (1955) by
considering the displacement in momentum space of the many-particle wave
function $\Psi_0(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$ describing
the state with no current flow. If the momentum
$\mathbf{p}_s(\mathbf{r}) = \hbar\nabla\chi(\mathbf{r})$ varies slowly in
space, the appropriate function is

$$\Psi = \exp\Big[i\sum_j \chi(\mathbf{r}_j)\Big]\,\Psi_0(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N). \qquad (7)$$

The quasiparticle excitations of the system may take up a new equilibrium
after the displacement. If so, the wave function will be modified from Eq. (7)
and the current flow will decrease from $\rho\mathbf{v}_s$ to
$\rho_s\mathbf{v}_s$, as discussed above. However, the momentum of the
ground-state particles or pairs is still given by

$$\mathbf{p}_s = \hbar\nabla\chi.$$


III. General Statistical Relations

To determine the average of any quantity $Q$ over a system in thermal
equilibrium we average over a grand canonical ensemble:

$$\langle Q\rangle_{av} = \frac{\sum_\alpha \langle \alpha, \mathbf{p}_s |\, Q\, e^{-\beta(\mathcal{H} - \mu N)} \,| \alpha, \mathbf{p}_s \rangle}{\sum_\alpha \langle \alpha, \mathbf{p}_s |\, e^{-\beta(\mathcal{H} - \mu N)} \,| \alpha, \mathbf{p}_s \rangle}, \qquad (8)$$

where $\mathcal{H}$ is the Hamiltonian, $N$ is the number operator, and
$\alpha$ represents a state of the system. In a homogeneous superfluid, one
must specify in addition the momentum $\mathbf{p}_s$ of macroscopic occupation
of the ground state and restrict the sum to states with a given
$\mathbf{p}_s$. Specifying $\mathbf{p}_s$ represents a “broken symmetry”
similar to specifying the direction of magnetization in a ferromagnet. The
matter flow is obtained by taking $Q = \sum_j \mathbf{p}_j$, where
$\mathbf{p}_j$ is the momentum operator for particle $j$. For
$\mathbf{p}_s = m^*\mathbf{v}_s$ sufficiently small, the flow $\mathbf{j}_s$
is proportional to $\mathbf{v}_s$; the coefficient of proportionality defines
$\rho_s$:

$$\mathbf{j}_s = \Big\langle \sum_j \mathbf{p}_j \Big\rangle_{av} = \rho_s\mathbf{v}_s. \qquad (9)$$

This definition of $\rho_s$ is a general one, not dependent on a quasiparticle
model for the excitation spectrum. As $T$ increases from 0 to $T_c$ (or
$T_\lambda$), $\rho_s$ decreases from the total density $\rho$ to zero.

A superfluid is characterized by a value of $\rho_s$ different from zero; in
a normal system, $\rho_s = 0$. Suppose that the equilibrium current is
initially zero in some rest frame. Then displace the entire system in velocity
space by $\mathbf{v}$, giving a flow $\rho\mathbf{v}$. In a normal system
equilibrium is reestablished by scattering of quasiparticles and the flow
drops to zero. The totality of states summed over to get the new equilibrium
current is exactly the same as the initial set. This is not true of a
superfluid, where a unique frame corresponding to that of macroscopic
occupation exists.

These considerations account for the persistence of currents in superfluids.
Scattering of quasiparticles does not change the common momentum
$\mathbf{p}_s$ of pairs in the ground state. Only a force which acts on all or
a large fraction of the particles can do so. Superfluid flow is the
equilibrium flow corresponding to the given $\mathbf{p}_s$.

That it is $\mathbf{p}_s$ (or $\mathbf{v}_s$) which remains unchanged is
shown very strikingly in experiments of Reppy and Depatie (1964) on persistent
current flow in He II. They initiate circulation in the helium by first
rotating the container above the critical velocity and then bringing it to
rest. In their first experiments, the amount of angular momentum in the
resulting persistent current flow was determined by releasing the container so
that it is free to rotate and then destroying the persistent current by a heat
pulse. The angular momentum initially in the liquid is then shared with the
container, so that the container starts to rotate with an angular velocity
$\omega$. If $I$ is the total moment of inertia of the container and liquid,
the angular momentum initially in the persistent current of the liquid is
$L = I\omega$. The observed values of $L$ are independent of the time the
container is held at rest, even for periods as long as a day. They find that
$L$ varies with temperature in the same way as $\rho_s/\rho$ (see Fig. 1). If
the temperature is just below $T_\lambda$, so that $\rho_s$ is small, $L$ is
correspondingly small. In later experiments (Reppy, 1965) the angular momentum
of the persistent current was measured directly by gyroscopic effects. This
nondestructive method permits repeated measurements of the same circulation
pattern as the temperature is varied. In this way, measurements could be made
very close to the $\lambda$ point (Clow and Reppy, 1966).

Fig. 1. Observed angular momentum of persistent currents in He II as a
function of the temperature to which the container is cooled. Squares indicate
values obtained when the container was cooled directly to the temperature of
measurement, and circles when it was first cooled to a higher temperature and
then cooled further while held at rest. The solid curve is proportional to
$\rho_s/\rho$. (After Reppy and Depatie, 1964.)

If one stops the container at a temperature $T_1$ just below $T_\lambda$ and
then cools the system to a lower temperature $T_2$ while it is held at rest,
the measured angular momentum is about the same as it would be if the
container were initially brought to rest at $T_2$. Thus it is $\mathbf{v}_s$
and not the angular momentum of the fluid that remains fixed as the container
is cooled. The angular momentum increases in proportion to $\rho_s$ as the
temperature is lowered from $T_1$ to $T_2$. Since the container is fixed, the
fluid can take up angular momentum from the walls. Somewhat similar
experiments have been done with persistent currents in thin superconducting
films.

The linear relation $\mathbf{j}_s = \rho_s\mathbf{v}_s$ is generally valid
for He II, but is valid for superconductors only when $\mathbf{v}_s$ is
sufficiently small. Further, in the presence of a magnetic field defined by a
vector potential $\mathbf{A}$, the free energy and current densities are
functions of $m^*\mathbf{v}_s$ as defined by Eq. (4) rather than of
$\mathbf{p}_s$ alone. More generally, after the initial linear increase,
$j_s(v_s)$ goes over a maximum with increasing $v_s$ and then decreases to
zero when $v_s$ reaches a critical value $v_c$, as shown schematically in
Fig. 2. Also shown in Fig. 2 is the way the free energy $F_s(v_s)$ depends on
$v_s$. The flow density is given by the derivative of the free energy:

$$\mathbf{j}_s(\mathbf{v}_s) = \partial F_s(\mathbf{v}_s)/\partial\mathbf{v}_s. \qquad (10)$$

At the critical velocity, the free energy in the superconducting state is
equal to that of the normal state:

$$F_s(v_c) = F_n. \qquad (11)$$

Note that the change in $F_s$ between $v_s = 0$ and $v_s = v_c$ is equal to
the free-energy difference between normal and superconducting states in the
absence of current flow:

$$F_s(v_c) - F_s(0) = F_n - F_s(0) = H_c^2/8\pi. \qquad (12)$$

Presumably similar relations hold for He II, but practically it is not
possible to reach velocities (or values of $\mathbf{v}_s - \mathbf{v}_n$)
sufficiently large to get appreciable depletion of the ground state and thus
deviations from the linear relation.

Fig. 2. Supercurrent density $j_s$ and free energy $F_s$ versus kinetic
momentum $m^*\mathbf{v}_s = \mathbf{p}_s - (e^*/c)\mathbf{A}$ (schematic).

As an example, let us consider the case of an ideal superconductor at
$T = 0$ in the quasiparticle approximation (Rogers, 1960; Bardeen, 1962). In
the frame for which $\mathbf{p}_s = 0$, the quasiparticle energies are given
by $E(\mathbf{p}) = (\epsilon_p^2 + \Delta^2)^{1/2}$, where $\epsilon_p$ is
the energy of the corresponding state in the normal metal. If the system is
displaced so as to move with a velocity $\mathbf{v}_s$, the quasiparticle
energy becomes $E(\mathbf{p}) + \mathbf{p}\cdot\mathbf{v}_s$. If $v_s$ is
sufficiently large it is favorable to form a pair of excitations corresponding
to displacing a particle from one side of the Fermi surface to the opposite
side. The energy required is

$$2(\Delta - p_F v_s), \qquad (13)$$

which becomes negative for $v_s > \Delta/p_F$. This corresponds to the Landau
criterion, $\mathbf{p}\cdot\mathbf{v}_s > E(\mathbf{p})$, for which the
displaced ground state becomes unstable against formation of quasiparticles.
When $v_s$ becomes larger than $\Delta/p_F$, depairing sets in rapidly and the
gap parameter $\Delta$ decreases. The current density reaches a maximum for
$v_s = 1.03\,\Delta/p_F$ and goes to zero for
$v_s = v_c = 1.359\,\Delta/p_F$. The region between $v_s = \Delta/p_F$ and
$v_s = v_c$ is one of gapless superconductivity, in that pairs of excitations
can be formed with no expenditure of energy; thus absorption can take place at
arbitrarily low frequency.
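
The thresholds in this example can be illustrated with a minimal sketch. It evaluates the quasiparticle energy in the frame with no flow and the pair-creation energy of Eq. (13), and sorts a given flow velocity into the three regimes described above. The regime labels and function names are ours, and the gap parameter passed in is assumed to be its value in the absence of flow; units are arbitrary but must be mutually consistent.

```python
import math

V_C_RATIO = 1.359  # critical velocity in units of Delta / p_F, as quoted in the text


def quasiparticle_energy(eps_p, delta):
    """E(p) = (eps_p**2 + Delta**2)**0.5 in the frame for which p_s = 0."""
    return math.hypot(eps_p, delta)


def pair_creation_energy(delta, p_f, v_s):
    """Eq. (13): energy needed to form a pair of excitations, 2(Delta - p_F v_s)."""
    return 2.0 * (delta - p_f * v_s)


def flow_regime(delta, p_f, v_s):
    """Classify v_s relative to Delta/p_F and v_c (hypothetical labels).

    ASSUMPTION: delta is the gap parameter at v_s = 0.
    """
    x = v_s * p_f / delta
    if x <= 1.0:
        return "gapped"      # pair creation costs energy
    if x < V_C_RATIO:
        return "gapless"     # Landau criterion exceeded, superfluid persists
    return "normal"          # beyond the critical velocity v_c


if __name__ == "__main__":
    for v in (0.5, 1.2, 1.4):
        print(v, pair_creation_energy(1.0, 1.0, v), flow_regime(1.0, 1.0, v))
```

The pair-creation energy changes sign at the Landau velocity, while the classification keeps the superfluid state up to the higher critical velocity, which is the distinction drawn in the text.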

In general the Landau criterion $\mathbf{p}\cdot\mathbf{v}_s > E(\mathbf{p})$
does not give the critical velocity for destruction of the superfluid state,
but rather that for the onset of gapless superconductivity (or superfluidity).
The superfluid state persists until a somewhat higher critical velocity is
reached, corresponding to $\Delta \to 0$ or, in the case of He II, to
depletion of the ground state, $n_0 \to 0$.

While there is an equilibrium mass flow when $\mathbf{p}_s$ is different
from zero, there is no corresponding heat flow. The heat flow is defined by

$$\mathbf{j}_h = \mathbf{j}_E - \mu\,\mathbf{j}_p, \qquad (14)$$

where $\mathbf{j}_E$ is the energy flow, $\mathbf{j}_p = \mathbf{j}/m$ is the
particle flow, and $\mu$ is the chemical potential. In the reference frame in
which the excitations come to equilibrium there is no heat flow,
$\mathbf{j}_h = 0$. Another way of expressing this result is that the entropy
flows with the normal component of the two-fluid model. While the absence of
heat flow in the equilibrium frame follows from general considerations, it is
difficult to give an explicit general proof from microscopic theory. Most
proofs that have been given make use of the quasiparticle approximation.
Ambegaokar and Rickayzen (1966) have derived general expressions for the
accelerated currents of energy and matter induced in a superconductor by a
long-wavelength electric field and have shown that the induced heat current
is indeed equal to zero. To prove this result it is necessary to introduce
some sort of scattering in order to bring about an equilibrium distribution.


IV. Quantization of Circulation and Vortex Lines 


Quantization of circulation follows from the expression relating the
ground-state momentum $\mathbf{p}_s$ to the gradient of the phase. The
relation $\mathbf{p}_s = \hbar\,\mathrm{grad}\,\chi$ implies
$\mathrm{curl}\,\mathbf{p}_s = 0$, or potential flow. From the fact that
$\chi$ must be single valued (modulo $2\pi$) it follows that

$$\oint \mathbf{p}_s\cdot d\mathbf{l} = nh, \qquad (15)$$

where the integration is around a closed contour and $n$ is an integer. In
liquid He II this implies quantization of circulation, as first observed by
Vinen (1961), and in superconductors quantization of flux, as we shall discuss
below. Vinen, by an ingenious method, measured the circulation of He II about
the axis of a cylindrical tube after it was cooled below the $\lambda$ point.
He generally found one unit of circulation, and was able to measure $h/m$ with
an accuracy of about 3%.

Fig. 3. Contour about a hole in a multiply connected superfluid.

Consider, as illustrated in Fig. 3, a contour around a hole in a multiply
connected superconductor through which flux is threading. Suppose that the
contour is outside of the penetration depth of the flux, so that everywhere
along the contour $\mathbf{v}_s = 0$. From Eq. (4), this implies that $\mathbf{p}_s = (e^*/c)\mathbf{A}$. Thus,
making use of the quantization condition (London, 1950):

$$\Phi = \text{flux enclosed} = \oint \mathbf{A}\cdot d\mathbf{l} = (c/e^*)\oint \mathbf{p}_s\cdot d\mathbf{l} = nhc/e^*. \tag{16}$$

With $e^* = 2e$, the flux unit is $hc/2e = 2 \times 10^{-7}$ G cm$^2$, half the value predicted
by London, who took $e^* = e$. The first measurements of flux quantization in
small superconducting cylinders by Deaver and Fairbank (1961) and by Doll
and Näbauer (1961), as well as subsequent experiments, give values corresponding
to $e^* = 2e$. They measured the flux frozen in small cylindrical tubes
with superconducting walls when cooled into the superconducting state in
the presence of an external field $H$ parallel to the axis of the tube. If there were
no quantization effects, one would expect to find $\Phi = aH$, where $a$ is the area
of the inside of the cylinder (including the penetration region of the superconductor).
However, the measurements indicate that $\Phi$ remains zero until $H$
reaches a value greater than half that required for a single flux quantum,
$\Phi_0 = hc/2e$, when $\Phi$ jumps to a full quantum. With further increase in $H$,
$\Phi$ remains constant until $H$ reaches a value corresponding to $\tfrac{3}{2}\Phi_0$, when $\Phi$ jumps
to $2\Phi_0$, etc.
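The staircase described above can be illustrated numerically. The following is a minimal sketch, not the experimenters' analysis: it evaluates $hc/2e$ in Gaussian units and assigns to each field the integer number of quanta closest to $aH/\Phi_0$. The constant values are rounded reference values, and the cylinder area is a hypothetical number chosen only for illustration.

```python
# Physical constants in Gaussian (cgs) units, rounded reference values.
H_PLANCK = 6.62607e-27   # erg s
C_LIGHT = 2.99792e10     # cm/s
E_CHARGE = 4.80320e-10   # esu


def flux_quantum(e_star=2.0 * E_CHARGE):
    """Flux unit h c / e* in G cm^2 (e* = 2e for electron pairs)."""
    return H_PLANCK * C_LIGHT / e_star


def trapped_quanta(field_gauss, area_cm2):
    """Idealized staircase for H >= 0: the integer n nearest to a H / Phi_0."""
    return int(field_gauss * area_cm2 / flux_quantum() + 0.5)


if __name__ == "__main__":
    phi0 = flux_quantum()
    print(f"hc/2e = {phi0:.3e} G cm^2")
    area = 1.0e-6  # cm^2, hypothetical cross-section (assumption)
    for fraction in (0.4, 0.6, 1.4, 1.6):
        field = fraction * phi0 / area
        print(f"aH/Phi_0 = {fraction:.1f} -> n = {trapped_quanta(field, area)}")
```

In this idealization the trapped flux changes by one quantum each time $aH$ passes a half-integer multiple of $\Phi_0$, which is the behavior reported for the cylinder experiments.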

The value $e^* = 2e$ was subsequently explained (Byers and Yang, 1961) in
terms of the pairing of electrons in the ground state. From the viewpoint of
London, there is not just one, but two independent ground states, one of
which, $\Psi_0$, yields even multiples, the other, $\Psi_{1/2}$, odd multiples of $hc/2e$. Effects
of pairing may be taken into account by taking for $\mathbf{p}_s$ in Eq. (15) the momentum
of a pair, as we have done.

As pointed out by Onsager (1949) and by Feynman (1955) for superfluid
helium and by Abrikosov (1957) for superconductors, it is possible to have
circulation in simply connected systems in the form of quantized vortex lines.
The amplitude of the effective wave function defining the motion of the superfluid
goes to zero at the core of the vortex line. The solution corresponding to
a straight vortex line along the z-axis of cylindrical coordinates $(r, z, \varphi)$ is of
the form:

$$\psi(r, \varphi) = a(r)\,e^{in\varphi}, \tag{17}$$

where $a(r) \to 0$ for $r \to 0$ and $n$ is an integer giving the number of quanta of
circulation. It can be shown that for a given total vorticity it is energetically
favorable to have it split up into vortex lines of single rather than multiple
quanta, so that normally $n = 1$. When He II is in rotation with an angular
velocity $\omega$, there is a uniform density of vortex lines; the number per unit area
is $2m\omega/h$. Many experiments have been done which demonstrate clearly the
presence of such quantized vortex lines in rotating helium. Abrikosov (1957)
showed that in certain types of superconductors, called type II, it is favorable
to have flux enter in the form of quantized vortex lines when the applied field
exceeds a lower critical field $H_{c1}$, the metal remaining superconducting until
the field reaches an upper critical field $H_{c2}$. His theory forms the basis for our
understanding of the properties of type II superconductors.
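To give a feeling for the magnitude of the vortex density $2m\omega/h$ quoted above for rotating He II, here is a small sketch. The helium-4 atomic mass and Planck constant are rounded reference values in cgs units, and the rotation speed in the example is an arbitrary illustrative choice.

```python
H_PLANCK = 6.62607e-27   # erg s
M_HE4 = 6.6465e-24       # g, mass of one 4He atom (rounded reference value)


def vortex_lines_per_cm2(omega_rad_s):
    """Areal density 2 m omega / h of singly quantized vortex lines."""
    return 2.0 * M_HE4 * omega_rad_s / H_PLANCK


if __name__ == "__main__":
    omega = 1.0  # rad/s, illustrative rotation speed (assumption)
    print(f"{vortex_lines_per_cm2(omega):.0f} lines per cm^2 at {omega} rad/s")
```

At 1 rad/s this gives roughly two thousand lines per square centimeter, so the mean spacing between lines is a fraction of a millimeter.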

Measurements of quantization of circulation or of flux allow one to determine
$h/m$ (for helium) or $h/e$ (for superconductors) by purely macroscopic
measurements. These are very striking illustrations of quantum effects on a 
macroscopic scale. 


V. Phase as a Quantum Variable 

So far we have shown how supercurrents are related to gradients of phase,
but we have not shown how the phase, $\chi$, itself may be defined as a quantum
variable. As we shall see, it may be regarded as the variable conjugate to the
particle number $N$, so that if one is specified, there is an uncertainty in the
other, with an uncertainty relation $\Delta N\,\Delta\chi \sim 1$. Let us first consider an isolated
body with no current flow, so that $\chi(\mathbf{r})$ is a constant independent of $\mathbf{r}$. To
introduce the phase, one must deal with states in which the number of particles
is not specified exactly. Thus if $\Phi_N$ is a state with exactly $N$ particles, one
considers linear combinations of functions of the form

$$\Psi = \sum_N a_N \Phi_N, \tag{18}$$

where the $\Phi_N$ differ only in the number of particles in the ground state. Such
functions were introduced for mathematical convenience by Cooper, Schrieffer,
and the author (see Bardeen et al., 1957) in their first papers on the theory of
superconductivity. It has turned out that use of such functions is not only
convenient for mathematical reasons, but also makes good physical sense, in
that it allows one to define the phase. There are no matrix elements of the
Hamiltonian connecting states with different $N$, so that if the $a_N$ are sharply
peaked about a given $N$, calculations made with $\Psi$ are substantially equal to
those made with $\Phi_N$. In superconductors, only even numbers of particles need
occur and $N$ may be regarded as the number of pairs. As pointed out by
Anderson (1958, 1964, 1966), Josephson (1962, 1964), Ferrell and Prange
(1963), and others, a state with a precise phase $\chi$ may be defined as the function

$$\Psi_\chi = \sum_N e^{iN\chi}\,\Phi_N. \tag{19}$$

Thus $N$ may be regarded as the operator $-i\,\partial/\partial\chi$ conjugate to $\chi$. Normally $N$
is a very large number, so that $\chi$ may be specified with little uncertainty even
though $\Delta N/N \ll 1$.

Because of the uncertainty relation $\Delta N\,\Delta\chi \sim 1$, a measurement of $\chi$ requires
an experiment in which the particle number N can change in an unspecified 
way. For a superconductor, this means that it must be possible for electrons 
to flow in or out of the body during the measurement, so that it cannot be 
electrically isolated. 
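The trade-off between number and phase can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes Gaussian amplitudes $a_N$ of width $\sigma$ in a superposition of the form of Eq. (18), projects onto phase states of the form of Eq. (19), and computes the spreads in $N$ and in phase. For this particular Gaussian choice the product comes out close to 1/2, consistent with an uncertainty product of order unity. The function name and the value of $\sigma$ are hypothetical.

```python
import cmath
import math


def number_phase_widths(sigma, n_phase=2001):
    """Return (delta_N, delta_chi) for Gaussian amplitudes a_N.

    a_N is taken proportional to exp(-(N - N0)^2 / (4 sigma^2)); only the
    offset k = N - N0 matters for the widths, so N0 never appears.
    """
    half_range = int(6 * sigma)
    offsets = range(-half_range, half_range + 1)
    amps = [math.exp(-(k * k) / (4.0 * sigma * sigma)) for k in offsets]

    # Spread of the number distribution |a_N|^2 (its mean offset is zero).
    norm_n = sum(a * a for a in amps)
    var_n = sum(k * k * a * a for k, a in zip(offsets, amps)) / norm_n

    # Overlap with a phase state: c(chi) = sum_N a_N exp(-i N chi).
    norm_chi = 0.0
    var_chi = 0.0
    for j in range(n_phase):
        chi = -math.pi + 2.0 * math.pi * j / (n_phase - 1)
        c = sum(a * cmath.exp(-1j * k * chi) for k, a in zip(offsets, amps))
        weight = abs(c) ** 2
        norm_chi += weight
        var_chi += chi * chi * weight
    return math.sqrt(var_n), math.sqrt(var_chi / norm_chi)


if __name__ == "__main__":
    d_n, d_chi = number_phase_widths(5.0)
    print(f"delta_N = {d_n:.3f}, delta_chi = {d_chi:.4f}, product = {d_n * d_chi:.3f}")
```

Doubling $\sigma$ doubles the number spread and halves the phase spread, which is the sense in which a sharply defined phase requires an ill-defined particle number.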

One way the phase difference between two superconductors can be measured
is by Josephson tunneling. Prior to experimental verification, Josephson (1962)
made the remarkable prediction that a supercurrent can flow between two
superconductors separated by a thin insulating barrier through which electrons
can tunnel. He showed that for a junction of area sufficiently small so that
magnetic effects are unimportant, the supercurrent flow is proportional to the
sine of the phase difference between the two metals:

$$J = J_1\sin(\chi_1 - \chi_2), \tag{20}$$

where $J_1$ is the maximum current density, which occurs for $\chi_1 - \chi_2 = \pi/2$. Many
of the most beautiful experiments illustrating the quantum aspects of superfluids
make use of Josephson tunneling.

Anderson (1962) has pointed out that the Josephson current is related to
the coupling energy between the two superconductors that arises from the
possibility that ground-state pairs can tunnel back and forth between them.
In the coupled system, the number of electrons on each side of the barrier
is not specified exactly. The free energy associated with this coupling is:

$$W_{12} = -\frac{\hbar J_1}{2e}\cos(\chi_1 - \chi_2), \tag{21}$$

where $J_1$ is now an electrical current rather than a mass flow. Thus

$$J = \frac{2e}{\hbar}\,\frac{\partial W_{12}}{\partial(\chi_1 - \chi_2)}. \tag{22}$$

This is the analog of Eq. (10) for flow due to a gradient in phase in a bulk
superfluid.
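The link between the coupling energy and the current-phase relation can be verified with a finite difference. This sketch assumes a coupling energy of the form $W_{12} = -(\hbar J_1/2e)\cos(\chi_1 - \chi_2)$, which is the form that reproduces Eq. (20) when inserted in Eq. (22); units with $\hbar = 2e = 1$ are chosen purely for convenience, and the function names are hypothetical.

```python
import math

HBAR = 1.0    # illustrative units (assumption)
TWO_E = 1.0   # charge of a pair in the same illustrative units (assumption)


def coupling_energy(dchi, j1):
    """Assumed form W_12 = -(hbar J_1 / 2e) cos(chi_1 - chi_2)."""
    return -(HBAR * j1 / TWO_E) * math.cos(dchi)


def current_from_energy(dchi, j1, step=1e-6):
    """J = (2e / hbar) dW_12 / d(chi_1 - chi_2), by central difference."""
    dw = coupling_energy(dchi + step, j1) - coupling_energy(dchi - step, j1)
    return (TWO_E / HBAR) * dw / (2.0 * step)


if __name__ == "__main__":
    for dchi in (0.0, math.pi / 6.0, math.pi / 2.0):
        print(dchi, current_from_energy(dchi, 1.0), math.sin(dchi))
```

The numerical derivative agrees with $J_1\sin(\chi_1 - \chi_2)$, and the energy is lowest at zero phase difference, where no supercurrent flows.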

Let us now consider a general formulation of superfluid flow in which the
phase $\chi(\mathbf{r})$ and the number density $n(\mathbf{r})$ are considered to be canonically conjugate
field variables, both of which can be specified with reasonable accuracy. The
free-energy density $F(\mathbf{p}_s, n)$ is assumed to depend on $\mathbf{p}_s(\mathbf{r}) = \hbar\nabla\chi$ and on $n(\mathbf{r})$,
implying a local theory. The total free energy may be taken as the Hamiltonian
for the system:

$$\mathcal{H} = \int F(\hbar\nabla\chi, n(\mathbf{r}))\,d\mathbf{r}. \tag{23}$$

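The Hamiltonian equations of motion with $\hbar\chi$ and $n$ as conjugate variables
are:

$$\frac{\partial n}{\partial t} = \frac{\delta\mathcal{H}}{\delta(\hbar\chi)} = -\mathrm{div}\,\mathbf{j}_p, \tag{24}$$

where $\mathbf{j}_p$ is the particle current density, and

$$\hbar\frac{\partial\chi}{\partial t} = -\frac{\partial F}{\partial n} = -\mu, \tag{25}$$

where $\mu$ is the chemical potential. The gradient of the latter equation is

$$\frac{\partial\mathbf{p}_s}{\partial t} = -\mathrm{grad}\,\mu. \tag{26}$$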

Equation (24) is the continuity equation and Eq. (26) the acceleration equation
for the superfluid. In a superconductor, $\mathbf{p}_s$, $n$, and $\mu$ all refer to pairs.
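The step from the Hamiltonian of Eq. (23) to the continuity and acceleration equations can be written out explicitly. The following is a sketch of that variation, assuming the identifications $\mathbf{j}_p = \partial F/\partial\mathbf{p}_s$ and $\mu = \partial F/\partial n$ and assuming that surface terms from the integration by parts vanish:

```latex
\begin{align*}
\delta\mathcal{H}
  &= \int \left[ \frac{\partial F}{\partial \mathbf{p}_s}\cdot\nabla\,\delta(\hbar\chi)
     + \frac{\partial F}{\partial n}\,\delta n \right] d\mathbf{r} \\
  &= \int \left[ -\left(\operatorname{div}\mathbf{j}_p\right)\delta(\hbar\chi)
     + \mu\,\delta n \right] d\mathbf{r},
\end{align*}
so that, with $n$ as coordinate and $\hbar\chi$ as conjugate momentum,
\begin{align*}
\frac{\partial n}{\partial t}
  &= \frac{\delta\mathcal{H}}{\delta(\hbar\chi)} = -\operatorname{div}\mathbf{j}_p, &
\hbar\frac{\partial\chi}{\partial t}
  &= -\frac{\delta\mathcal{H}}{\delta n} = -\mu .
\end{align*}
```

Taking the gradient of the second relation and using $\mathbf{p}_s = \hbar\nabla\chi$ then gives the acceleration equation, Eq. (26).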

More general equations for superconductors can be obtained by including
magnetic fields through the vector potential. The free energy and current
density then depend on the kinetic momentum $m^*\mathbf{v}_s$ as defined in Eq. (4), the
equation of motion for which is

$$m^*\frac{\partial\mathbf{v}_s}{\partial t} = -\mathrm{grad}\,\mu - \frac{e^*}{c}\frac{\partial\mathbf{A}}{\partial t}. \tag{27}$$

It should be emphasized that these equations apply to impure as well as pure
superconductors, even when the mean free path is less than the coherence
distance. Equation (27) is the second London equation; the first is that corresponding
to Eq. (4).

Conservation of free energy follows from Eqs. (25) and (26). If we multiply
(26) on the left by $\partial F/\partial\mathbf{p}_s$ and on the right by the equivalent $\mathbf{j}_p$, we get

$$\frac{\partial F}{\partial\mathbf{p}_s}\cdot\frac{\partial\mathbf{p}_s}{\partial t} = \frac{\partial F}{\partial t} = -\mathbf{j}_p\cdot\mathrm{grad}\,\mu. \tag{28}$$

The right-hand side is the rate at which work is done on the system.


VI. Definitions in Terms of Green’s Functions 


The condensate wave function can be defined more precisely in terms of
exact states by use of many particle functions with varying numbers of particles
in the ground state, as in Eq. (18). The functions refer to the entire
system and differ only in the number of particles in the condensate; the excitations
are assumed to be the same, at least to terms of $O(1/N)$. The sum for a
given state $\alpha$ of the system,

$$\Psi_\alpha = \frac{1}{(\Delta N)^{1/2}}\sum_{m=-j}^{j}\Phi_{\alpha,N+m}, \tag{29}$$

is over a range $\Delta N = 2j + 1$ in the vicinity of $\langle N\rangle_{\mathrm{av}}$, where $\Delta N$ is a large
number, but such that $\Delta N/N$ is small. Averages over the ground state or over
a thermal ensemble are made with use of the functions $\Psi_\alpha$.

For the Bose system, superfluid He, the condensate wave function is defined
by (Beliaev, 1958):

$$\psi_c(\mathbf{r}, t) = \langle\psi(\mathbf{r}, t)\rangle, \tag{30}$$

where $\psi(\mathbf{r}, t)$ is the wave field operator, $\sum_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{r}} c_{\mathbf{p}}(t)$, in the Heisenberg
representation. Typical matrix elements which contribute to the thermal average
are of the form

$$\langle N - 1, \alpha|\,\psi(\mathbf{r}, t)\,|N, \alpha\rangle. \tag{31}$$

The wave function $\psi_c(\mathbf{r}, t)$ has amplitude and phase, and is of the form

$$\psi_c(\mathbf{r}, t) = a(\mathbf{r}, t)\,e^{i\chi(\mathbf{r}, t)}, \tag{32}$$

where $a(\mathbf{r}, t)$ may be interpreted as $[n_0(\mathbf{r}, t)]^{1/2}$ as in Eq. (6). Hydrodynamical
equations for superfluid flow can be derived from the equation of motion
for $\psi_c(\mathbf{r}, t)$; a rather complete discussion has been given by Hohenberg and
Martin (1964).

In a superconductor, one must deal in general with an effective wave function
for a pair of particles. The appropriate function is the anomalous Green’s
function introduced by Gor’kov (1958) and defined by

$$e^{-2i\mu t}F_{\alpha\beta}(\mathbf{r}_1, t_1; \mathbf{r}_2, t_2) = \langle\psi_\alpha(\mathbf{r}_1, t_1)\,\psi_\beta(\mathbf{r}_2, t_2)\rangle, \tag{33}$$

where $\alpha$ and $\beta$ are opposite spin coordinates. The time factor associated with
removing a pair of particles with Fermi energy $\mu$ has been separated out.
Averages are again over states of the form (29); in typical matrix elements the
particle number differs by two on the two sides:

$$\langle N - 2, m|\,\psi_\alpha(\mathbf{r}_1, t_1)\,\psi_\beta(\mathbf{r}_2, t_2)\,|N, m\rangle. \tag{34}$$

It is the two-particle rather than the one-particle density matrix which has a
factorizable part:

$$\rho_2(\mathbf{r}_1, \mathbf{r}_2; \mathbf{r}'_1, \mathbf{r}'_2) = F(\mathbf{r}_1, \mathbf{r}_2)\,F^{+}(\mathbf{r}'_1, \mathbf{r}'_2) + \text{incoherent terms}. \tag{35}$$

For a homogeneous system with no current flow, $F$ depends only on the
difference coordinates:

$$F_{\alpha\beta}(\mathbf{r}_1 - \mathbf{r}_2, t_1 - t_2). \tag{36}$$

Slow changes with position or time can be taken into account by the additional
variables $\mathbf{R} = \tfrac{1}{2}(\mathbf{r}_1 + \mathbf{r}_2)$ and $T = \tfrac{1}{2}(t_1 + t_2)$. The range of $\mathbf{r}_1 - \mathbf{r}_2$ over which $F_{\alpha\beta}$
has appreciable magnitude defines the size of the pair wave function; it is of
the order of the Pippard coherence distance $\xi_0$.

A local theory of the sort we have used earlier applies only when changes
with $\mathbf{R}$ occur slowly over a coherence distance. When this is not true, the
structure of the pair wave function becomes important and the dependence
of $F_{\alpha\beta}$ on $\mathbf{r}_1$ and $\mathbf{r}_2$ should include effects of the space and time variations.
The resulting equations give time-dependent generalizations of the Ginzburg-Landau
equations. They are difficult to apply when nonlocal effects are
important. A uniform current flow is described by an $F$ of the form:

$$F_{\alpha\beta} = \exp(i\mathbf{p}_s\cdot\mathbf{R})\,F(\mathbf{r}_1 - \mathbf{r}_2), \tag{37}$$

where $\mathbf{p}_s$ is the momentum of a pair.

Also important in the Green’s function formulation is the usual single-particle
Green’s function, which for a homogeneous system is defined by

$$G(\mathbf{r}, t) = -i\langle T\,\psi(\mathbf{r}, t)\,\psi^{+}(0, 0)\rangle. \tag{38}$$

Here $T$ represents time ordering. The Fourier transform may be expressed in
terms of the spectral function $A(\mathbf{p}, \omega)$. As defined by Kadanoff and Baym
(1962), $G(\mathbf{r}, t) = G^{>}(\mathbf{r}, t)$ for $t > 0$ and $G = G^{<}$ for $t < 0$, where

$$G^{>}(\mathbf{p}, \omega) = (1 - f(\omega))\,A(\mathbf{p}, \omega), \qquad G^{<}(\mathbf{p}, \omega) = f(\omega)\,A(\mathbf{p}, \omega), \tag{39}$$

and where $f(\omega)$ is the Fermi function $1/(1 + e^{\beta(\omega - \mu)})$. One may regard $G^{<}$ as
the density of particles of momentum $\mathbf{p}$ and energy $\omega$ and $G^{>}$ as the density
of holes. The general expressions for matter and heat currents are



(40) 



(41) 


Here we have specified $\mathbf{p}_s$ as the momentum state of macroscopic occupation.
For a superfluid, $\mathbf{j}_s(\mathbf{p}_s)$ differs from zero when $p_s > 0$, but $\mathbf{j}_h$ should always be
zero in the equilibrium frame. As we mentioned earlier, it seems to be difficult
to give a general proof of the latter without making use of the quasiparticle
approximation.

VII. Interference Effects with Josephson Tunnel Junctions 

Following Josephson’s prediction that superfluid flow could occur between 
two superconductors separated by a thin tunneling barrier, and that the 
current flow depends on the difference in phase of the superconducting wave 
functions on the two sides, many experiments have been done to confirm the 
theory and to exhibit wave interference phenomena. We shall discuss two of 
these, one involving flow through a single junction and the other two junctions 
in parallel. What is measured is the maximum supercurrent $I_{\max}$ as a function
of magnetic field $H$ applied parallel to the plane of the junction and transverse
to the direction of current flow. The effect of the magnetic field is to give a
shift in phase difference along the barrier in a direction perpendicular to the
field. This results in oscillations in $I_{\max}$ as a function of $H$ as currents from
various parts of the barrier add in and out of phase. A single barrier gives a
pattern analogous to a single-slit diffraction pattern in optics; the two junctions
in parallel give a pattern similar to double-slit diffraction.

A single tunnel junction in a transverse magnetic field is illustrated schematically
in Fig. 4. The barrier is generally an oxide layer with a thickness $t$ of the
order of 10-20 Å. We shall suppose that the width of the junction is sufficiently
small so that magnetic fields produced by the tunnel current itself can be
neglected. The applied field is assumed to be in a direction perpendicular to
the plane of the paper; it extends into the superconductors on each side of the
junction a distance of the order of the penetration depth $\lambda$, typically $\sim 5 \times 10^{-6}$
cm. The junction extends for a width $w$, from $x = -w/2$ to $x = +w/2$ with
the center at $x = 0$. Let $\Phi \approx Hw(t + 2\lambda)$ be the total flux penetrating the junction;
the flux per unit length is $\Phi/w$.

Fig. 4. Schematic diagram of Josephson tunneling barrier between superconductors 1
and 2. The vertical scale is greatly exaggerated, the thickness $t$ of the barrier being ~10-20 Å.


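Let us consider how the phase difference varies with $x$. Consider a contour
just outside of the penetration region so that $\mathbf{v}_s = 0$ and $\mathbf{p}_s = \hbar\nabla\chi = (2e/c)\mathbf{A}$.
Thus

$$\chi_1(x) = \chi_1(0) + \frac{2e}{\hbar c}\int_0^x \mathbf{A}\cdot d\mathbf{l}, \tag{42a}$$

$$\chi_2(x) = \chi_2(0) + \frac{2e}{\hbar c}\int_0^x \mathbf{A}\cdot d\mathbf{l}. \tag{42b}$$

The difference may be expressed in the form

$$\chi_1(x) - \chi_2(x) = \chi_1(0) - \chi_2(0) + \frac{2e}{\hbar c}\oint \mathbf{A}\cdot d\mathbf{l}, \tag{43}$$

where the contour extends between 0 and $x$ along the dotted lines indicated
in Fig. 4. The contour integral is just the enclosed flux $x\Phi/w$. Defining $\Phi_0 = hc/2e$
as one flux unit, we have

$$\chi_1(x) - \chi_2(x) = \chi_1(0) - \chi_2(0) + 2\pi x\Phi/w\Phi_0. \tag{44}$$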

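The maximum supercurrent occurs for $\chi_1(0) - \chi_2(0) = \pi/2$. Using this value,
and Eq. (20) for the current density, we have per unit length of junction:

$$I_{\max} = J_1\int_{-w/2}^{w/2}\sin\!\left(\frac{\pi}{2} + \frac{2\pi x\Phi}{w\Phi_0}\right)dx = J_1 w\,\frac{\sin(\pi\Phi/\Phi_0)}{\pi\Phi/\Phi_0}. \tag{45}$$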

This expression is identical with that of the amplitude of a Fraunhofer pattern
for optical diffraction by a slit, with $\Phi$ or magnetic field replacing position on
the screen. Note that $I_{\max}$ vanishes when $\Phi$ is an integral multiple of a flux
quantum $\Phi_0$.
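The closed form of Eq. (45) can be checked against a direct numerical integration of the current density $J_1\sin(\chi_1 - \chi_2)$ across the junction, using the linear phase variation of Eq. (44) with $\chi_1(0) - \chi_2(0) = \pi/2$. This is a minimal sketch; the function names, the unit values of $J_1$ and $w$, and the midpoint rule are illustrative choices, not part of the original treatment.

```python
import math


def i_max_closed_form(flux_ratio, j1=1.0, width=1.0):
    """Eq. (45): J_1 w sin(pi Phi / Phi_0) / (pi Phi / Phi_0)."""
    arg = math.pi * flux_ratio
    if abs(arg) < 1e-12:
        return j1 * width
    return j1 * width * math.sin(arg) / arg


def i_max_numeric(flux_ratio, j1=1.0, width=1.0, n_steps=20000):
    """Midpoint-rule integral of J_1 sin(pi/2 + 2 pi x Phi / (w Phi_0)) over x."""
    dx = width / n_steps
    total = 0.0
    for k in range(n_steps):
        x = -width / 2.0 + (k + 0.5) * dx
        phase = math.pi / 2.0 + 2.0 * math.pi * x * flux_ratio / width
        total += j1 * math.sin(phase) * dx
    return total


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(ratio, i_max_closed_form(ratio), i_max_numeric(ratio))
```

Both evaluations vanish at integer values of $\Phi/\Phi_0$, where the contributions from different parts of the barrier cancel exactly.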

Such a diffraction pattern was first observed by Rowell (1963). Figure 5 is a
plot from data of Langenberg et al. (1966) for a Sn-SnO$_2$-Sn junction at
1.2°K with a width of 0.25 mm. Only one sign of the magnetic field is shown;
the pattern is symmetric for the opposite sign. The period of the Fraunhofer
pattern is 1.25 G.

Fig. 5. Maximum supercurrent versus magnetic field for a Sn-SnO$_2$-Sn junction at
1.2°K. The plot should be symmetric for fields in the opposite direction. (After Langenberg
et al., 1966.)
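As a rough consistency check on these numbers, one period of the pattern corresponds to one flux quantum through the effective area $w(t + 2\lambda)$, so the quoted period and width give an estimate of $t + 2\lambda$. The sketch below is only an order-of-magnitude illustration; it assumes the field is uniform over the junction and uses a rounded value of $hc/2e$.

```python
PHI_0 = 2.0678e-7   # G cm^2, flux quantum hc/2e (rounded)


def effective_thickness_cm(period_gauss, width_cm):
    """t + 2*lambda from Phi = H w (t + 2*lambda), one period = one flux quantum."""
    return PHI_0 / (period_gauss * width_cm)


if __name__ == "__main__":
    # Period 1.25 G and width 0.25 mm = 0.025 cm, as quoted for Fig. 5.
    print(f"t + 2*lambda ~ {effective_thickness_cm(1.25, 0.025):.2e} cm")
```

The result is a few times $10^{-6}$ cm, the same order as twice the penetration depth, with the oxide thickness making a negligible contribution.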


Jaklevic and co-workers (1964, 1965) have observed interference phenomena
for a configuration of two Josephson tunnel junctions in parallel, as illustrated
schematically in Fig. 6. In a transverse magnetic field, flux in the insulating
region A separating the upper and lower conductors 1 and 2 gives a difference
in phase between the junctions a and b. Oscillations in total current flow
between 1 and 2 occur as the currents in a and b come in and out of phase
with varying magnetic field. If $\Phi_A$ is the total flux enclosed in the circuit, the
phase difference between a and b from $\Phi_A$ is $2\pi\Phi_A/\Phi_0$. Including the diffraction
effects at each junction as given by Eq. (45), the maximum current per unit
length of junction is

$$I_{\max} = 2J_1 w\,\frac{\sin(\pi\Phi_J/\Phi_0)}{\pi\Phi_J/\Phi_0}\,\cos(\pi\Phi_A/\Phi_0). \tag{46}$$

Fig. 6. Configuration for two Josephson tunnel junctions a and b in parallel. The
region A is insulating. The maximum supercurrent between superconductors 1 and 2 is
observed as a function of a magnetic field applied normal to the cross section indicated. A
very thin insulating layer, C, separates the two metals at the junctions. (After Jaklevic et al.,
1964, 1965.)

Fig. 7. Maximum supercurrent versus magnetic field (in milligauss) for configuration similar to that
of Fig. 6 with junctions of Sn-SnO$_2$-Sn. For A the field periodicity is 39.5 mG, for B 16
mG. Approximate maximum currents are 1 mA (A) and 0.5 mA (B). (After Jaklevic et al.,
1965.)

Since the flux $\Phi_A$ is much larger than the flux $\Phi_J$ going through each junction,
one observes with varying field a rapid oscillation with period given by $\Phi_A$
superimposed on an envelope given by $\Phi_J$. Typical plots of $I_{\max}$ versus $H$ for
two different specimens are shown in curves A and B of Fig. 7.
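The two-junction pattern can be sketched as the product of a slowly varying single-junction envelope and a rapidly varying interference factor, in the spirit of Eq. (46). In the sketch below the absolute value is taken because a maximum current is a magnitude; the function names and the ratio of junction area to loop area are hypothetical, and the loop area is simply inferred from a field periodicity by setting one period equal to one flux quantum.

```python
import math

PHI_0 = 2.0678e-7   # G cm^2, flux quantum hc/2e (rounded)


def two_junction_i_max(h_gauss, loop_area_cm2, junction_area_cm2, j1_w=1.0):
    """Envelope sin(x)/x from each junction times cos(pi Phi_A / Phi_0)."""
    x_j = math.pi * h_gauss * junction_area_cm2 / PHI_0
    envelope = 1.0 if abs(x_j) < 1e-12 else math.sin(x_j) / x_j
    fast = math.cos(math.pi * h_gauss * loop_area_cm2 / PHI_0)
    return 2.0 * j1_w * abs(envelope * fast)


def loop_area_from_period(period_gauss):
    """Enclosed area for which one field period adds one flux quantum."""
    return PHI_0 / period_gauss


if __name__ == "__main__":
    area_a = loop_area_from_period(0.0395)   # 39.5 mG periodicity (curve A)
    area_b = loop_area_from_period(0.016)    # 16 mG periodicity (curve B)
    print(f"loop areas: {area_a:.2e} cm^2 (A), {area_b:.2e} cm^2 (B)")
    junction_area = area_a / 100.0           # hypothetical ratio (assumption)
    for h in (0.0, 0.01, 0.01975, 0.0395):
        print(h, two_junction_i_max(h, area_a, junction_area))
```

With these periodicities the enclosed areas come out of the order of $10^{-5}$ cm$^2$, which indicates the scale of the insulating region in such specimens.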

Some experiments have been done by introducing flux in the circuit by means 
of a solenoid confined to the insulating region A so that while there is a vector 
potential there is no magnetic field within the superconductors. Similar results 
are observed. Since the junction regions are in zero or a small constant stray 
field, there is no variation of $\Phi_J$ as $\Phi_A$ is changed and the oscillations have
uniform amplitude. There is, of course, an emf in the superconducting circuit 
when the field in the solenoid is changed. However, they find similar results if 
the specimen is warmed up to the normal state while the flux is changed and 
then cooled. The experiments demonstrate the importance of the vector 
potential in determining flow even when no electric or magnetic fields exist 
in the metals while in the superconducting state. 

Values of the flux quantum deduced from such experiments are found to 
be within a few percent of the theoretical value, $hc/2e$. The main source of
error is usually the measurement of the effective area through which the field 
penetrates. 

Zimmerman and Mercereau (1965) have rotated a circular interferometer
with two junctions in parallel to measure $h/m$ of the superconducting electrons.
Rotation at an angular velocity $\omega$ is equivalent in first order to a magnetic
field along the axis of rotation, $H = (2mc/e)\omega$. Oscillations in $I_{\max}$ are observed
as $\omega$ is varied. From the period of the oscillation and the effective area of the
circuit, they find a value for the Compton wavelength $h/mc = (2.4 \pm 0.1) \times 10^{-10}$
cm, very close to the accepted value, $2.43 \times 10^{-10}$ cm.
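The size of the rotation effect follows directly from the equivalence $H = (2mc/e)\omega$. The sketch below evaluates the equivalent field per unit angular velocity, the change in rotation rate that adds one flux quantum $hc/2e$ through a loop of given area (which reduces to $h/4mA$), and the Compton wavelength $h/mc$. Constants are rounded reference values in Gaussian units, and the 1 cm$^2$ loop area is a hypothetical example.

```python
H_PLANCK = 6.62607e-27   # erg s
C_LIGHT = 2.99792e10     # cm/s
E_CHARGE = 4.80320e-10   # esu
M_ELECTRON = 9.10938e-28 # g


def equivalent_field_gauss(omega_rad_s):
    """Field equivalent to rotation: H = (2 m c / e) omega."""
    return 2.0 * M_ELECTRON * C_LIGHT / E_CHARGE * omega_rad_s


def rotation_period_rad_s(area_cm2):
    """Change in omega that adds one flux quantum hc/2e through the loop."""
    phi0 = H_PLANCK * C_LIGHT / (2.0 * E_CHARGE)
    return phi0 / (area_cm2 * equivalent_field_gauss(1.0))


def compton_wavelength_cm():
    """h / (m c) for the electron."""
    return H_PLANCK / (M_ELECTRON * C_LIGHT)


if __name__ == "__main__":
    print(f"H per rad/s: {equivalent_field_gauss(1.0):.3e} G")
    print(f"period for 1 cm^2 loop: {rotation_period_rad_s(1.0):.3f} rad/s")
    print(f"h/mc = {compton_wavelength_cm():.4e} cm")
```

Because the period in $\omega$ depends only on $h/m$ and the loop area, measuring it amounts to a macroscopic determination of $h/m$.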

VIII. The ac Josephson Effect 

Another prediction of Josephson (1962) is that if a steady voltage $V$ is
applied to a tunnel junction one should find an alternating supercurrent with
an angular frequency $\omega = 2eV/\hbar$. The voltage implies a difference in chemical
potential per pair on the two sides of $2eV$. Thus, from Eq. (25), the phase
difference increases linearly with time:

$$\chi_1(t) - \chi_2(t) = \chi_1(0) - \chi_2(0) - 2eVt/\hbar. \tag{47}$$

The supercurrent should then vary with frequency as

$$J = J_1\sin[\chi_1(0) - \chi_2(0) - \omega t] \tag{48}$$

with $\omega$ as given above. One might expect that the current would decrease
rapidly as the frequency approaches that required to excite a pair of quasiparticles,
corresponding to $eV \sim \Delta(T)$. This would limit $V$ to the millivolt
range and the frequencies at low temperatures to ~350 GHz for Sn and
~650 GHz for Pb. However, Shapiro steps, as discussed below, are observed
to much higher frequencies (e.g., up to 900 GHz).
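The conversion between junction voltage and Josephson frequency, $\nu = 2eV/h$, is simple enough to tabulate. The sketch below uses SI values of $h$ and $e$; the example voltages and frequencies are illustrative inputs, not data from the experiments discussed here.

```python
H_PLANCK_SI = 6.62607015e-34   # J s
E_CHARGE_SI = 1.602176634e-19  # C


def josephson_frequency_hz(voltage_volts):
    """Frequency nu = 2 e V / h of the alternating supercurrent."""
    return 2.0 * E_CHARGE_SI * voltage_volts / H_PLANCK_SI


def voltage_for_frequency(frequency_hz):
    """Voltage V = h nu / 2e at which the Josephson frequency equals nu."""
    return H_PLANCK_SI * frequency_hz / (2.0 * E_CHARGE_SI)


if __name__ == "__main__":
    print(f"1 mV  -> {josephson_frequency_hz(1.0e-3) / 1e9:.1f} GHz")
    print(f"350 GHz -> {voltage_for_frequency(350e9) * 1e3:.3f} mV")
    print(f"650 GHz -> {voltage_for_frequency(650e9) * 1e3:.3f} mV")
```

One millivolt corresponds to about 484 GHz, which is why gap-scale voltages place the alternating current in the far-microwave region.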

The microwave power generated directly in this way is small and difficult to 
observe because of the poor impedance match between the junction volume 
and the space outside. Langenberg et al. (1965, 1966) have obtained a power
of $\sim 10^{-11}$ W in the X band external to a Sn-SnO$_2$-Sn junction. By using
one junction as a generator and a second junction closely coupled to the first
as a detector, Giaever (1965) was able to observe a power in the detector
junction of the order of $10^{-7}$ W.

Another way of observing the ac Josephson current indirectly is to beat it
against an applied microwave field. When the frequency of the Josephson
current, $2eV/h$, is coincident with or an integral multiple of that of the applied
field, a direct supercurrent should be observed. In experiments that have been
done to verify this prediction, first by Shapiro (1963), a fixed microwave field
is applied and the voltage across the junction is measured as the current
passing through the junction is varied. Steps where voltage is constant for a
considerable range of current are observed (Fig. 8) when $V = nh\nu/2e$. Anderson
and Dayem (1964) have done similar experiments in which the tunnel junction 
is replaced by a narrow thin-film bridge separating the two superconductors. 

Fig. 8. Current-voltage characteristic of a Nb-oxide-Pb tunnel junction at 4.2°K with
microwave power at a frequency of 9.75 Gc applied. Current scale (vertical) is 0.20 mA per
large division; voltage scale (horizontal) is 0.020 mV per large division. Steps of constant
voltage are at intervals $h\nu/2e = 0.0197$ mV. (After Shapiro, private communication.)

An analog of the ac Josephson effect in superconductors has been observed
by Richards and Anderson (1965) in helium. Two baths of superfluid helium
are separated by a small orifice which provides a weak coupling between the
two containers. The equivalent of the microwave field is provided by an
ultrasonic transducer which gives a sound wave field of frequency $\nu$ impinging
on the orifice. A difference $z$ in head between the two baths gives a difference
in chemical potential across the orifice of $mgz$, where $g$ is the gravitational
acceleration. The head difference $z$ is measured as a function of time. Small
steps of approximately constant head were observed at values of $z = nh\nu/mg$.
Some subharmonics, especially half-steps, were observed as well.
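The scale of the head steps $z = nh\nu/mg$ can be estimated for a given transducer frequency. In this sketch the helium-4 atomic mass and the constants are rounded SI values, and the 100 kHz frequency is a hypothetical example, since the frequency used in the experiment is not quoted here.

```python
H_PLANCK_SI = 6.62607015e-34   # J s
M_HE4_KG = 6.6465e-27          # kg, mass of one 4He atom (rounded)
G_ACCEL = 9.80665              # m/s^2, standard gravity


def head_step_m(frequency_hz, n=1):
    """Head difference z = n h nu / (m g) at the n-th step, in meters."""
    return n * H_PLANCK_SI * frequency_hz / (M_HE4_KG * G_ACCEL)


if __name__ == "__main__":
    nu = 1.0e5  # Hz, illustrative transducer frequency (assumption)
    print(f"first step at z = {head_step_m(nu) * 1e3:.3f} mm")
    print(f"half-step at z = {head_step_m(nu, n=0.5) * 1e3:.3f} mm")
```

A sound frequency of order 100 kHz gives steps of about a millimeter of head, large enough to be followed as the level difference relaxes.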

IX. Vortex Motion in Superconductors 

Quantized vortexes occur in both helium and superconductors and have
been studied in great detail. We shall confine the discussion here to vortexes
in superconductors and their motion, a problem important for understanding
the properties of type II superconductors. As first predicted by Abrikosov
(1957), in the mixed state between the lower and upper critical fields, $H_{c1}$ and
$H_{c2}$, flux penetrates in the form of a closely packed array of parallel vortexes,
each containing a single flux quantum $hc/2e$. Thus for $B \sim 10^3$ G, there
are $\sim 10^3/(2 \times 10^{-7}) = 5 \times 10^9$ lines/cm$^2$.

Fig. 9. Schematic plot of magnitude of the gap parameter $|\Delta(r)|$ versus the radial
distance $r$ from the center of a vortex line for nonlocal and local theories. (After Bardeen
and Stephen, 1965.)
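The line density in the mixed state, and the corresponding spacing between vortexes, follow from dividing the flux density by the flux quantum. The sketch below assumes, purely for illustration, a square array when converting density to spacing (the actual arrangement is not specified here) and uses a rounded value of $hc/2e$.

```python
import math

PHI_0 = 2.0678e-7   # G cm^2, flux quantum hc/2e (rounded)


def vortex_density_per_cm2(b_gauss):
    """Number of singly quantized flux lines per cm^2 at flux density B."""
    return b_gauss / PHI_0


def square_array_spacing_cm(b_gauss):
    """Nearest-neighbor spacing if the lines formed a square array (assumption)."""
    return 1.0 / math.sqrt(vortex_density_per_cm2(b_gauss))


if __name__ == "__main__":
    b = 1.0e3  # G
    print(f"density = {vortex_density_per_cm2(b):.2e} lines/cm^2")
    print(f"spacing ~ {square_array_spacing_cm(b):.2e} cm")
```

At $10^3$ G the lines are of order $10^{-5}$ cm apart, still several times the typical penetration depth quoted earlier, so neighboring vortexes overlap only weakly at such fields.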

A detailed theory requires a knowledge of the structure of a vortex and of
the quasiparticle excitation spectrum in the vicinity of the core. The gap parameter
$\Delta(r)$ may be taken as the effective wave function of Eq. (17). As illustrated
by the solid line in Fig. 9, the amplitude $|\Delta(r)|$ goes to zero as $r \to 0$
and approaches a constant value $\Delta_0$ for large $r$. The size of the core as determined
by the radius corresponding to the point of inflection is of the order of
the coherence distance $\xi_0$. Caroli et al. (1964) have calculated the excitation
spectrum for a pure superconductor near $T = 0$°K. They find that it is about
the same as that of a model consisting of a normal core of radius $a \approx \xi_0$ about
which the supercurrents circulate. The region $r < a$ is one of gapless superconductivity.

The nonlocal theory is difficult, requiring solutions of coupled nonlinear
differential equations. However, a good qualitative and even semiquantitative
description can be obtained with use of a local model of the sort we have
discussed in previous sections. In this model it is assumed that the local superfluid
current density depends only on the local kinetic momentum, $m^*\mathbf{v}_s =
\mathbf{p}_s - (e^*/c)\mathbf{A}$. In a gauge for which $\mathbf{A}$ has only a $\varphi$-component $A_\varphi(r)$, $\mathbf{p}_s$ also
has only a $\varphi$-component, given by

$$p_{s\varphi}(r) = \hbar/r. \tag{49}$$

As $r$ decreases, $p_{s\varphi}$ increases until at some radius $r = a$, $v_s$ reaches the critical
value $v_c$ for destruction of superconductivity. In the local model, $\Delta(r)$ decreases
as one approaches the core and goes to zero at $r = a$, as indicated in the dotted
line of Fig. 9. For $r < a$, the metal is normal. If the vector potential can be
neglected, the core radius for a pure metal near $T = 0$ is

$$a = \hbar/(2mv_c) = 1.16\,\xi_0, \tag{50}$$

about equal to that of the model suggested by Caroli et al. (1964).

It is possible to express the core radius in terms of the upper critical field
$H_{c2}$, which also depends on the coherence distance. Gor’kov’s (1959) expression
for $H_{c2}$ at $T = 0$°K is

$$H_{c2} = 1.5\,\hbar c/(4e\xi_0^2) \approx \hbar c/2ea^2. \tag{51}$$

This relation between $H_{c2}$ and $a$ also appears to be valid at finite temperatures
and in the presence of impurity scattering. The only restriction is that $\mathbf{A}$ can
be neglected, which implies $H \ll H_{c2}$. For fields near $H_{c2}$ the corresponding
relation is $H_{c2} = \hbar c/ea^2$.

Several groups (Bardeen and Stephen, 1965; Volger et al., 1964; van
Vijfeijken and Niessen, 1965; Nozières and Vinen, 1966) have used a local
model to discuss the theory of vortex motion in type II superconductors.
Experimentally, what is observed is the resistivity and Hall effect in the mixed
state when a transport current flows perpendicular to the direction of the
magnetic field and thus to the vortex lines, as illustrated in Fig. 10. Each
vortex is subject to a force per unit length $(\mathbf{J}_T \times \boldsymbol{\Phi}_0)/c$, where $\mathbf{J}_T$ is the transport
current density and $\boldsymbol{\Phi}_0$ is a vector representing one quantum of flux directed
along the vortex lines. The vortex lines thus tend to move in a direction perpendicular
to $\mathbf{J}_T$. Ordinarily the lines are pinned and do not move until $J_T$
is sufficiently large to overcome the pinning forces. When this is the case,
voltages are observed. The electric field in the specimen is just what one would
calculate as being generated by the moving flux lines by induction. If $\mathbf{v}_L$ is
the velocity with which they move and $B \approx H$ is the total flux density from the
vortex lines, $\mathbf{E} = -(1/c)(\mathbf{v}_L \times \mathbf{B})$.

Fig. 10. Schematic diagram illustrating flux flow in type II superconductors. Vortex
lines are produced by a transverse magnetic field. They are driven across the slab by a
transport current along the length.

Under steady-state conditions, vortexes are created on one side of the
specimen and destroyed on the opposite, so there is no net change of flux.
For this reason, objections have been raised to calling the electric field an
induction field, but the result is nevertheless correct (Josephson, 1965; Casimir,
1965). A direction of $\mathbf{v}_L$ perpendicular to $\mathbf{v}_T$ gives a normal resistive voltage;
motion parallel to $\mathbf{v}_T$ gives a Hall voltage.

From measurements on several specimens of varying composition and at
various temperatures, Kim et al. (1965) found an approximate law of corresponding
states for the flux flow resistivity $\rho_f$. Their observations could be
fitted by a law of the form

$$\rho_f/\rho_n = (H/H_{c2})\,f, \tag{52}$$

where $\rho_n$ is the resistivity in the normal state, $H$ is the applied field, $H_{c2}$ is
the upper critical field, and $f$ is a factor of the order of one, actually nearly
equal to one at $T = 0$°K. A Hall voltage of the same order as that in the normal
state is also observed, as will be discussed later. Kim et al. pointed out that
the dissipation could be accounted for approximately if in the flux flow state
the transport current flows directly through the essentially normal cores of
the vortex lines. Rosenblum and Cardona (1964) had earlier proposed such a
picture in their interpretation of data on the microwave surface resistance of
type II superconductors.

The theory based on the local model gives just this result. Electric fields
generated by the motion of the vortex lines lead to normal current flow in
the vicinity of the cores and thus to dissipation and viscous drag. The motion
of a vortex is essentially adiabatic, so that if $\mathbf{J}_0(\mathbf{r})$ is the circulation about a
stationary vortex, the current distribution of a moving vortex is $\mathbf{J}_0(\mathbf{r} - \mathbf{v}_L t)$.
Similarly, $\mathbf{p}_s$ and $\mu$ are functions of $\mathbf{r} - \mathbf{v}_L t$. The electric field $\mathbf{E}$ (or, more
properly, voltage gradient) is then given by

$$\partial\mathbf{p}_s/\partial t = -\mathbf{v}_L\cdot\nabla\mathbf{p}_s = -\nabla\mu = e^*\mathbf{E}, \qquad r > a, \tag{53}$$

with $p_s = p_{s\varphi} = \hbar/r$ as in Eq. (49). For $r < a$, $\mathbf{E}$ is constant and perpendicular
to $\mathbf{v}_L$; the field pattern is illustrated in Fig. 11.



Fig. 11. Electric field pattern generated by motion of a vortex line according to the
local theory.

When the motion $\mathbf{v}_L$ is produced by the force due to a transport current,
the field is just that required to drive the transport current through the normal
core. The total current is

$$\mathbf{J}_T + \mathbf{J}_0(\mathbf{r} - \mathbf{v}_L t),$$

where $\mathbf{J}_T$ is in part a super and in part a normal current. The model accounts
for the empirical law of Kim [Eq. (52)] for the flux flow resistivity. If the vortex
were pinned, the transport current would flow around the normal core and
there would be no dissipation.

According to the model, the Hall angle in the mixed state should be the 
same as that of the normal state for a field equal to that within the core of the 
vortex line. When the Hall effect is taken into account, there is an angle $\alpha$
between the current density $\mathbf{J}_c$ and the electric field $\mathbf{E}_c$, as illustrated in Fig. 12.

Fig. 12. Origin of Hall voltage in flux flow according to the local theory. When in
motion from a force due to a transport current density $\mathbf{J}_T$, the current flows directly through
the normal core so that the density inside the core $\mathbf{J}_c = \mathbf{J}_T$. The electric field $\mathbf{E}_c$ is at a Hall
angle $\alpha$ relative to $\mathbf{J}_c$. Motion of the vortex is perpendicular to $\mathbf{E}_c$, so that $v_{Lx}/v_{Ly} = \tan\alpha$.
(After Bardeen and Stephen, 1965.)

The transport current still flows directly through the normal core, so that the
current density within the core, $\mathbf{J}_c$, is equal to and parallel to $\mathbf{J}_T$. The velocity
$\mathbf{v}_L$ is normal to the electric field, and so makes an angle $\alpha$ with the vertical
direction, that normal to $\mathbf{J}_T$. The Hall angle in the mixed state given by
$\tan\alpha = v_{Lx}/v_{Ly}$ is therefore also equal to $\alpha$. This result appears to be consistent
with experimental measurements on very pure niobium (Reed et al., 1965),
but not with those on alloys (Niessen et al., 1965). For the latter, the Hall
angle increases as $H$ drops below $H_{c2}$, a result not yet accounted for by theory.

X. Summary

We have reviewed here just a few of the many experiments that have been 
done which relate to macroscopic quantum aspects of superfluids. Superfluid 
flow can be described by a condensate wave function with amplitude and phase. 
While a correct description of a superconductor is nonlocal, requiring a pair 
wave function, a good qualitative and even semiquantitative picture can be 
obtained from a local model. We have seen how the superfluid properties are 
related to the macroscopic occupation of a quantum state, from Bose condensation
in He II and from pairing in superconductors. The momentum state
of macroscopic occupation is related to the gradient of phase of the condensate
wave function. The phase may be regarded as a quantum field variable conjugate
to particle number. The Hamiltonian is the total free energy of the
system and the equations of motion are the continuity equation and the 
equation giving acceleration of the superfluid. Applications of the theory to 
experiments on quantization of circulation and of flux, on various phenomena 
involving Josephson tunneling and on vortex motion in type II superconductors 
illustrate some of the aspects of macroscopic quantization that have been 
observed. 

I. Introduction and Remarks about Professor Slater

This article is a consequence of two aspects of Professor Slater’s existence. 
He is noted for the simplicity and soundness of his presentations of difficult 
subjects, and his coming birthday has led to an invitation for contributions. 

The material presented in this contribution has resulted as a by-product of 
an investigation of the theory of noise for active two-terminal transit-time
devices, particularly those involving the negative differential mobility known
as the Gunn effect. The connection between the circuit properties of such
devices and basic noise generation required the development of new theoretical
approaches. These in turn led the authors progressively further back to some 
of the foundations of the theory of noise for steady-state nonequilibrium 
conditions. 

Since a major objective was to understand the undesirably high noise 
figures of experimental devices, the attitude of the research had a practical 
aspect. Furthermore, the expression of the results should be readily accessible 
for application by semiconductor device engineers. 

These considerations called for a presentation quite different from the 
basic reference for this subject: “Fluctuations from the Nonequilibrium Steady
State” by Lax (1). In this treatment Lax investigates the postulates necessary
to arrive at various general formulas (such as our additivity theorem for 
spectral densities and the expression of spectral densities in terms of diffusion 
constants). He also surveys the literature and comments on relevant aspects of 
many contributions. In this article we have put the emphasis on exposition, 
and the interested reader is referred to Lax for references related to earlier 
work. 

Some comments on the sequence of the research may be of interest. A 
starting point was the idea that “straggle diffusion,” resulting from intervalley
scattering, might be an important source of additional noise. This led
rather quickly to the recognition of the importance of the $4q^2Dn = 4kT\sigma$
relationship between noise source density and Johnson noise discussed in
Section VII. This noise source density was then used to derive the noise appearing at
the device terminals. In this way the macroscopic fluctuations were expressed
in terms of the microscopic fluctuations of the charge carriers. A very puzzling
period of confusion occurred in this development. When the formula we now
call the “diffusion-impedance field noise formula” was first derived, the
impedance field was thought of only as a “localized transfer impedance.”
What this quantity had to do with the lumped constant $R$ for Johnson noise
for the equilibrium case was frustratingly obscure until the “distributed
power theorem” was discovered. This then led to the impedance field
concept which now makes the relationship relatively obvious.

Other conceptual difficulties involved thinking about separate disturbances
of holes and electrons, and here the method of imrefs (2,3) was found applicable,
and a reciprocity theorem for a four-terminal imref situation was
derived. [The reader may be interested in knowing that the “most appropriate
authority” (3) who reduced by a factor of three the number of syllables
required to describe “quasi-Fermi level” was Fermi himself, half-facetiously,
in response to a request for suggestions.]

The preparation of the article reflects the senior author’s training under 
Slater. One of the most helpful thinking aids has been the “movie film”
visualization of the principle of detailed balance. This movie film thinking
tool is used in Section II as a replacement for mathematically sophisticated
tools such as Fourier integral theorems. In fact an attempt has been made
to keep the mathematical requirements for the reader at the level ordinarily 
in use for elementary circuit theory analysis, diffusion equations, and 
semiconductor device theory. 

Some results representing in effect work in progress at the time the
manuscript was transmitted are included as appendices. For these the
level of exposition has been aimed more at the professional specialist in the
field.

Except for these appendices an attempt has been made to produce a contribution
that would, hopefully, get a passing grade in a Slater course on science
textbook writing.


II. The Fluctuation Model 


The motions of electrons of interest in semiconductors are those in which 
they move as conduction electrons in the conduction band or as holes in the 
valence band. They may also be captured on traps and may tunnel through the 
energy gap in abrupt junctions. For all such cases whether active or passive, 
it is possible to conceive of idealized noiseless steady-state conditions having 
the same dc values as the actual conditions. The noise which we shall analyze 
arises from deviations from this ideal noiseless steady state. 

As an example of a configuration corresponding to a noiseless steady state 
we visualize a semiconductor device carrying constant average currents. 
Under these conditions we imagine that in the conducting regions the current 
is carried by electrons or holes moving monotonically without scattering with
a fictitious microscopic velocity equal to the macroscopic drift velocity, as 
suggested in Fig. 1. We can imagine the electrons to be moving in uniformly 
spaced arrays, perhaps ideally on a flowing lattice which changes one of its 
lattice constants as the electrons proceed from regions of higher density to 
lower density. (We shall not here consider trapping or carrier generations.
For such cases electrons generated through recombination centers could
probably be regarded as produced by each center in a perfectly regular way at
a fixed frequency for each center with the phases of various centers of the
same frequency uniformly distributed.) The idealized noiseless situation will
produce negligible ac signals or noise at the device terminals. Differences
between the behavior in an actual case and this idealized case are the fluctuations
that we shall consider in calculating noise.

Fig. 1. Actual motion and idealized noiseless motion.

In order to seek the simplest possible level in the presentation of this 
material we shall avoid the use of Fourier integral methods and shall instead 
make use of a fictitious temporal periodicity for our conceptual model of 
the device. This enables the treatment to proceed in terms of a Fourier series 
with periodicity T p . The physical justification is that if we arbitrarily select the 
fictitious period as being long compared to the time of measurement for some 
noise measuring instrument, then the average value read by the instrument 
will be the same whether the system repeats from t = T p to 2 T p just what it 
did from t = 0 to T p or undergoes independent random processes. 

For purposes of exposition we shall consider the movie film method used 
by Slater and assume that we have a photographic record of the motion of 
each particle in the actual circumstances. We replace the actual behavior of the 
system by taking a section of duration T p and running it repetitively. In order 
to avoid discontinuities we imagine that just before the completion of the 
desired time period T p , we look at the picture at zero time and then beginning 
at some short time before T p we deliberately introduce some forces from the 
outside so that just as the time T p is reached we have succeeded in placing 
electrons and holes in the specimen so as to duplicate precisely the conditions 
that prevailed at t = 0. The disturbance we produce by this adjustment 
operation can be made as insignificant a part of the history as we like by 
making T p longer. 

Accordingly any of the processes we consider are periodic with the important
features displayed in equations (1) to (3):

$$T_p = \text{fictitious period.} \qquad (1)$$

The phenomena can then be expanded in Fourier series with frequencies $\omega$:

$$\omega = \text{multiples of } \frac{2\pi}{T_p}. \qquad (2)$$

The allowed frequencies are evenly spaced with the minimum interval between
frequencies being

$$\Delta f_{\min} = \frac{1}{T_p} = \text{minimum frequency interval.}$$
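
The frequency grid implied by the fictitious period can be tabulated directly. The following is a minimal sketch, assuming an illustrative period of 100 s; the function names and the numerical values are hypothetical and serve only to show that the spacing is $1/T_p$ and that a band $\Delta f$ contains $T_p\,\Delta f$ Fourier lines.

```python
import math


def allowed_frequencies(period_s, count):
    """First `count` nonzero allowed frequencies f = m / T_p, in Hz."""
    return [m / period_s for m in range(1, count + 1)]


def lines_in_band(period_s, bandwidth_hz):
    """Number of Fourier lines n(Δf) = T_p * Δf lying in a band of width Δf."""
    return period_s * bandwidth_hz


T_p = 100.0  # assumed fictitious period in seconds (illustrative only)
freqs = allowed_frequencies(T_p, 3)
# Angular frequencies are multiples of 2*pi / T_p.
omegas = [2.0 * math.pi * f for f in freqs]
spacing = freqs[1] - freqs[0]  # minimum frequency interval, equal to 1 / T_p
```

Lengthening the assumed period makes the grid finer without changing any band-averaged quantity, which is the reason the choice of $T_p$ is immaterial.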

The fluctuating quantity of chief interest in this article is the velocity of a
carrier. To be specific we consider a hole which advantageously has a positive
charge; we write its velocity as

$$v(t) = v_0(t) + \delta v(t) = v_0(t) + u(t). \qquad (3)$$

This represents a hole which is moving in the structure considered in
approximately the location in which an ideal hole in the noiseless reference
state would move with a velocity $v_0(t)$. For this purpose $v_0(t)$ is the steady
motion with the macroscopic drift or diffusion velocity. Consequently
$\delta v(t)$, or $u(t)$ for brevity, will contain the random or Brownian motion that for
the thermal equilibrium case gives Johnson noise.

In developing the noise phenomena in Fourier series, we shall focus
attention on certain elements of volume in the device denoted by $\Delta(\mathrm{vol})_\alpha$,
$\Delta(\mathrm{vol})_\beta$, etc. These are supposed to be many mean free paths in size, so that
the random velocity of a hole in one volume is not significantly correlated with
its random velocity in another volume. [One of the major steps in the development
will consist of showing in Eq. (56) that the sum of the effects of these
elements of volume may be replaced by a volume integral.]

At any given instant in a volume

$$\Delta(\mathrm{vol})_\alpha = \Delta x\,\Delta y\,\Delta z \qquad (4)$$

there is a vector dipole current defined by

$$\delta\mathbf{P}_\alpha = \sum_j q\,\mathbf{u}_j \qquad (5)$$

(in some cases we shall imply a vector relationship by such an equation; in
other cases it will apply to say the x-component only; it is intended that the
context make this clear). If we consider the effect during a time from $t_1$ to
$t_2$, then the disturbance from the ideal steady state is as if hole “j” were
displaced by

$$\Delta\mathbf{r}_j = \int_{t_1}^{t_2} \mathbf{u}_j(t)\,dt \qquad (6)$$

compared to how it would have moved for the ideal noiseless condition. The
effect is thus as if at the end of the interval $t_1 \to t_2$ a charge $q$ had been removed
from where the hole would have gone from $t_1$ to $t_2$ in the noiseless condition
and replaced with a displacement of $\Delta\mathbf{r}_j$. This is like setting up a current
generator between two points $\Delta\mathbf{r}_j$ apart and passing an average current
$q/(t_2 - t_1)$ between them during interval $t_1 \to t_2$. This corresponds to building
up a dipole at a rate $q\mathbf{u}_j$.

Another way of visualizing this situation is to consider the displacement
current reaching two grounded parallel planes $\Delta x$ apart. The current due to a
moving charge in vacuum is $\delta I_{jx}$ where well-known electrostatic theory (4)
gives

$$\delta I_{jx} = \frac{q_j u_{jx}}{\Delta x} \qquad (7)$$

and the dipole charge builds up on the plates at a rate

$$\delta P_{jx} = \Delta x\,\delta I_{jx} = q_j u_{jx}. \qquad (8)$$

If the space between the plates is filled with a uniform semiconducting dielectric,
it can be seen that the same division of charge occurs. (This can be
established by noting that the point charge will have the same influence as it 
would have if spread uniformly over a plane; and a plane charge if inserted 
from infinity will induce charges in direct proportion to the admittances on 
both sides.) However, it is not necessary to consider these details since their 
effects are automatically accounted for in the impedance field treatment of 
the next section. 

Each element $\Delta(\mathrm{vol})_\alpha$ is thus taken to contain a fluctuation of $\delta P_\alpha$ that has
a component in the direction of interest [this direction is identified in
connection with Eq. (27)] given by

$$\delta P_\alpha = q\sum_j u_j(t)\ \text{in}\ \Delta(\mathrm{vol})_\alpha = q\sum_\omega a_\omega e^{i\omega t} = qU_\alpha, \qquad (9)$$

where $a_\omega$ is a Fourier coefficient of the period $T_p$. The $u_j(t)$ contribution from
a given electron persists only while it is in $\Delta(\mathrm{vol})_\alpha$. After it moves out of
$\Delta(\mathrm{vol})_\alpha$, its fluctuations are accounted for in another element of volume. The
symbol $U_\alpha$ will be used to represent the sum over the holes in the volume.


III. The Impedance Field 

In order to determine the effects of the $\delta P_\alpha$ fluctuations on measurable
noise, we shall next consider some purely macroscopic small-signal aspects of
a general semiconductor device. For this purpose we consider two physical
contacts (for example the two terminals of a diode). We ground one of them
and measure an ac component of voltage denoted by $\delta V_N$ in respect to
ground on the other ($\delta V_N$ will become noise voltage in Sections IV and V).
We shall refer to this other terminal as “N.”

We now imagine, as represented in Fig. 2, a third imaginary point contact
at a position interior to the body at a vector position $\mathbf{r}_\alpha$ in respect to some
fixed origin. If the semiconductor is n-type, we imagine that at this “contact”
we furnish an inward current $\delta I_\alpha$ which means we extract electrons from that
point at a rate $\delta I_\alpha/|q|$ electrons per second. The return circuit is provided to
the grounded terminal.

Fig. 2. The impedance field: (a) Current $\delta I_\alpha$ into fictitious terminal “$\alpha$” produces
$\delta V_N$. (b) Current $\delta I_N$ produces $\delta V_\alpha$. (c) Dipole current $\delta\mathbf{r}\,\delta I_\alpha$ produces $\delta V_N$.

This disturbance in carrier distribution may alter charge distributions far 
from the point at $\mathbf{r}_\alpha$, especially if the structure is an active device with large
electric fields. It will alter the voltage at “N.” In keeping with small-signal
theory we may write

$$\delta V_N = Z_{N\alpha}\,\delta I_\alpha, \qquad (10)$$

where $\delta I_\alpha$ is now thought of as a complex current vector $\delta I_{\alpha 0} \exp(i\omega t)$ of
which the real part represents the current. The impedance factor $Z_{N\alpha}$ is also a
complex function of $\omega$ and $\mathbf{r}_\alpha$.

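Next consider what occurs if a current is put in at $\mathbf{r}_\alpha$ (by removing $\delta I_\alpha/|q|$
electrons/sec at $\mathbf{r}_\alpha$) and removed at a point $\mathbf{r}'_\alpha$ (by injecting the electrons back
at that point). By linear superposition the resulting $\delta V_N$ is

$$\delta V_N = [Z_{N\alpha}(\mathbf{r}_\alpha) - Z_{N\alpha}(\mathbf{r}'_\alpha)]\,\delta I_\alpha$$
$$\phantom{\delta V_N} = \nabla Z_{Nr} \cdot \delta\mathbf{r}\,\delta I_\alpha = \nabla Z_{Nr} \cdot \delta\mathbf{P}_\alpha, \qquad (11)$$

where $Z_{Nr}$ is understood to be a function of $\omega$ corresponding to voltage produced
at the left subscript by current in at the right subscript and where $\delta\mathbf{r}$
is the (nearly) infinitesimal separation

$$\delta\mathbf{r} = \mathbf{r}_\alpha - \mathbf{r}'_\alpha$$

and the dipole current vector $\delta\mathbf{P}_\alpha$ is defined as

$$\delta\mathbf{P}_\alpha = \delta\mathbf{r}\,\delta I_\alpha. \qquad (12)$$

Evidently the dipole current vector $\delta\mathbf{P}_\alpha$ is just the rate at which an electric
dipole vector would increase in strength in free space by a transfer of plus
charge at rate $\delta I_\alpha$ from point $\mathbf{r}'_\alpha$ to point $\mathbf{r}_\alpha$.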

The relationship to the fluctuation considerations of Section II should now
be apparent. Since $\delta P_{\alpha x}$ of that section represents a rate of transfer of charge
over a distance $\Delta x$ it can produce an output voltage calculated from the x-component
of $\nabla Z_{Nr}$ times the x-component of $\delta\mathbf{P}_\alpha$.

The impedance field vector $\nabla Z_{Nr}$, defined for small signals as discussed in
the development of Eq. (11), is an essential concept in the treatment of noise
in this article. It has the dimensions of electric field divided by current or
ohms/cm. The quantity $\nabla Z_{Nr} \cdot \delta\mathbf{r}$ = transfer impedance for current between
$\mathbf{r}_\alpha$ and $\mathbf{r}'_\alpha$ in producing voltage between ground and “N.”

We shall also use the reciprocal transfer impedance $\nabla Z_{rN} \cdot \delta\mathbf{r}$ = transfer
impedance N to the $\mathbf{r}_\alpha$-$\mathbf{r}'_\alpha$ pair. This gives the voltage developed at $\mathbf{r}_\alpha$ minus
that at $\mathbf{r}'_\alpha$ per unit current into N flowing to ground. We shall use this in
connection with the distributed power theorem of Eq. (16).

Similarly $Z_{Nr}$ and $Z_{rN}$ represent voltages above ground produced per unit
current into $\alpha$ and out at ground for $Z_{Nr}$ and into N and out at ground for
$Z_{rN}$. When $\mathbf{r}_\alpha$ terminates at “N,” then $Z_{Nr}$ and $Z_{rN}$ are identical and become
the impedance $Z$ of the diode:

$$\text{For “r”} \to \text{“N”:}\quad Z_{Nr} = Z_{rN} = Z = R + iX, \qquad (13)$$

where $R$ and $X$ are the real and imaginary parts.

For the purpose of later comparison with Johnson noise, we shall next
derive the distributed power theorem which shows a relationship between a
certain volume integral of the impedance field and $R$, the real part of the
impedance between “N” and ground. For this purpose we consider that the
specimen can be described by a complex conductance scalar (extensions to
conductance tensors and to separate hole and electron conductances will not
be considered here), which may be a function of position so that the total
current density, including displacement current, is $\mathbf{J}$ amp/cm$^2$ given by

$$\mathbf{J} = \sigma\mathbf{E} = (\sigma_r + i\sigma_i)\mathbf{E}, \qquad (14)$$

where $\mathbf{E}$ is the electric field. If a current $I_0 \exp(i\omega t)$ flows into “N,” then the
real electric field $\boldsymbol{\mathcal{E}}$ is

$$\boldsymbol{\mathcal{E}} = \mathrm{Re}\,[\nabla Z_{rN}\, I_0 e^{i\omega t}],$$
$$\mathcal{E}_x = |\partial Z_{rN}/\partial x|\, I_0 \cos(\omega t + \text{const}),\ \text{etc. for } y \text{ and } z. \qquad (15)$$

This electric field produces an in-phase current density component that is $\sigma_r$
times as large as each component of $\boldsymbol{\mathcal{E}}$.

The average power dissipation (watts/cm$^3$) is thus $\tfrac{1}{2}|\nabla Z_{rN}|^2 \sigma_r I_0^2$; the $\tfrac{1}{2}$
comes from $\langle\cos^2\rangle$ and the absolute value squared of a complex vector is the
sum of the absolute values squared for its three components.

If the real part of the impedance between “N” and ground is $R$, then the
total power in the circuit is $\tfrac{1}{2}RI_0^2$. Equating this power to the integral of the power
density gives

$$R = \int |\nabla Z_{rN}|^2 \sigma_r\, d(\mathrm{vol}). \qquad (16)$$

This relationship may be referred to as the distributed power theorem.

If the specimen is reciprocal so that

$$Z_{rN} = Z_{Nr} \quad \text{for thermal equilibrium and } B = 0 \qquad (17)$$

as is the case for thermal equilibrium and no magnetic field, then the reciprocal
power theorem applies to $Z_{Nr}$ as well as to $Z_{rN}$ and we have

$$R = \int |\nabla Z_{Nr}|^2 \sigma_r\, d(\mathrm{vol}). \qquad (18)$$

This last expression involves the impedance field appropriate for the distributed
noise sources that replace $\sigma_r$ in nonequilibrium conditions. Equations
(16-18) cause the new noise formula of Section VIII to reduce correctly to
Johnson-Nyquist noise (5,6) for thermal equilibrium.
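
The distributed power theorem of Eq. (16) can be checked numerically in the simplest geometry. The sketch below assumes a one-dimensional bar of cross section `area` and length `length`, with a real, position-dependent conductivity (low frequency, so that $\sigma = \sigma_r$) and with ground at $x = 0$ and terminal “N” at the far end. Under these assumptions $|\partial Z_{rN}/\partial x| = 1/(\sigma_r A)$, and the volume integral should reproduce the ordinary series resistance. The function name and the sample values are hypothetical.

```python
def distributed_power_resistance(sigma, area, length, steps=1000):
    """Midpoint-rule estimate of R = integral of |grad Z_rN|^2 * sigma_r d(vol).

    sigma  : function x -> real conductivity (1/(ohm cm)) at position x
    area   : cross-sectional area of the bar (cm^2)
    length : length of the bar (cm); ground at x = 0, terminal N at x = length
    """
    dx = length / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        # One-dimensional impedance field: voltage gradient per unit current.
        grad_z = 1.0 / (sigma(x) * area)
        # Power-weighted contribution of the slab of volume area * dx.
        total += grad_z ** 2 * sigma(x) * area * dx
    return total


# Uniform bar: expected R = length / (sigma * area).
r_uniform = distributed_power_resistance(lambda x: 2.0, area=0.5, length=3.0)
# Graded bar with sigma = 1 + x: expected R = ln(2) / area.
r_graded = distributed_power_resistance(lambda x: 1.0 + x, area=2.0, length=1.0)
```

In one dimension the result is almost a tautology, but the same weighting by $|\nabla Z|^2\sigma_r$ is what carries over to structures in which the current spreads.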

IV. The Principle of Linearity for Significant Deviations 

The types of noise considered in this article are regarded as caused by
microscopic fluctuations from a noiseless steady state by deviations represented
in general by $\delta F(t)$. Evidently as defined $\delta F(t)$ has a long time average
value of zero. Examples of $\delta F$ already discussed are $\delta P$ and $u$. Significant
resulting noise values, denoted in general by $\delta N(t)$, are produced at an output
terminal when a sufficient number of independent microscopic events produce
a sufficiently large combined effect $\delta F(t)$. For such significantly large deviations
from steady state, a microscopic linear modulation occurs in the sense
that the output noise is given by a frequency dependent factor $M$ times $\delta F$.
This microscopic linear modulation corresponds to transfer impedance
coefficients like those discussed in the impedance field treatment.

The principle of significant linearity is the foundation upon which expressions
like $\delta V_N = Z_{N\alpha}\,\delta I_\alpha$ depend, i.e., resulting noise is obtained by multiplying
the elementary fluctuations by a linear modulating factor which in general is
a frequency-dependent complex number.

This principle of linear modulation of significant deviations is particularly 
relevant for the case of space-charge limited shot noise in vacuum tubes. This 
can be seen by considering the insignificant effect of one electron. Obviously 
one extra electron crossing the potential energy maximum cannot modulate 
the current of the other electrons by, for example, 0.983 other electrons. 
However, the noise due to one extra electron would be negligible, i.e., not
significant. A significant deviation, occurring by chance of course, of 10,000
electrons would be a macroscopic disturbance in the linear range and could
quite reasonably be expected to prevent on the average the passage of 9830
other electrons. In this case of a significant deviation it is appropriate to say
that the random current $\delta F$ corresponding to 10,000 electrons is modulated by
a reduction factor $M = 1 - 0.983 = 0.017$ to produce a final current that
might be read as $\delta N$ which corresponds to $M\,\delta F$. This is the principle of
linearity for significant deviations.*

V. Spectral Densities 

In this section we shall assume that the fluctuations in an element of volume
$\Delta(\mathrm{vol})_\alpha$ produce a dipole current $\delta\mathbf{P}_\alpha$ which in turn produces an output noise
voltage $\delta V_N$. We shall assume that the $\Delta(\mathrm{vol})_\alpha$ are sufficiently large so that no
elementary fluctuation, such as a mean free path, overlaps significantly from
one to another. Consequently, $\delta\mathbf{P}_\alpha$ and $\delta\mathbf{P}_\beta$ from two regions are uncorrelated.

On the other hand a disturbance in $\Delta(\mathrm{vol})_\alpha$ caused by $\delta\mathbf{P}_\alpha$ can propagate in
an active device and produce disturbances in $\Delta(\mathrm{vol})_\beta$. The effects of these
disturbances on the output, however, are already included in $\nabla Z_{Nr} \cdot \delta\mathbf{P}_\alpha$.
What is deliberately neglected is any influence that $\delta\mathbf{P}_\alpha$ may have on the
fluctuations $\delta\mathbf{P}_\beta$ themselves. This is legitimate in semiconductors where the
basic fluctuations are collisions, capture, emission, etc., which are not
significantly affected by noise disturbances from other regions.

We shall first define spectral density of a fluctuating quantity and prove an
additivity theorem. For brevity and generality we shall first not treat

$$\delta V_N = \sum_\alpha \nabla Z_{Nr} \cdot \delta\mathbf{P}_\alpha \qquad (19)$$

but instead a generic form

$$\delta N = \sum_j M_j\,\delta F_j = \sum_\omega b_\omega e^{i\omega t}, \qquad (20)$$

where $\delta N$ and $\delta F_j$ represent fluctuations in general, all of course with the
fictitious periodicity $T_p$. The equation is symbolic and means that $M_j$
is a complex impedancelike quantity so the equation has significance in
keeping with linear circuit theory conventions. For calculating a real physical
case we must then have $b_\omega = b^*_{-\omega}$ so that $\delta N$ will be a real function of time.
This equality that assures reality will automatically occur since the $\delta F_j$ are
themselves real quantities and consequently the Fourier coefficients $a_{j\omega}$
satisfy $a^*_{j\omega} = a_{j,-\omega}$ and, furthermore, the impedancelike quantities $M$ satisfy

$$M_j(\omega) = M_j^*(-\omega) \qquad (21)$$

for any real physical situations. (We exclude possible influences of magnetic
fields.) Since the value of $b_\omega$ is

$$b_\omega = \sum_j M_j(\omega)\,a_{j\omega}, \qquad (22)$$

the relationship $b_\omega = b^*_{-\omega}$ follows at once.

* We have tested these ideas on a distinguished author of review articles on noise. His
reaction does not suggest that they are either banal or unsound; however, his reaction does
suggest that their significance may not be obvious from the brief discussion of this section.

The spectral density $S(\delta N, \omega)$ is defined like an average power over the
period $T_p$ (denoted by $\langle\ \rangle_{T_p}$) per unit frequency as follows:

$$S(\delta N, \omega) \equiv \frac{\langle(\sum_\omega b_\omega e^{i\omega t})^2\rangle_{T_p}}{\Delta f}
= \frac{\langle(\sum 2|b_\omega|\cos(\omega t + \text{const}))^2\rangle}{\Delta f}
= \frac{\sum 2|b_\omega|^2}{\Delta f} = 2\langle|b_\omega|^2\rangle_\omega T_p. \qquad (23)$$

[The identity is a definition that the average is over one period $T_p$; the first
equal sign follows from the reality of $\delta N$; the second from the fact that the
average over the period $T_p$ eliminates cross terms between different $\omega$ values
and gives a factor $\tfrac{1}{2}$ from $\cos^2$; and the third equality follows from the fact
that $n(\Delta f)$, the number of frequencies in $\Delta f$, is $T_p\,\Delta f$ so that $1/\Delta f = T_p/n(\Delta f)$
and $\sum|b_\omega|^2/n(\Delta f) = \langle|b_\omega|^2\rangle_\omega$.] In this expression $\langle|b_\omega|^2\rangle_\omega$ is an average over
a narrow frequency band at $\omega$, which contains many frequencies if $T_p$ is large
enough. (The individual $|b_\omega|^2$ in this range will fluctuate because of effects
like the off-diagonal terms in Fig. 3 discussed below.)

The well-known Johnson-Nyquist noise formula (5,6), $\langle\delta V^2\rangle = 4kTR\,\Delta f$,
leads to a spectral distribution of

$$S(\delta V, \omega) = 4kTR. \qquad (24)$$

We shall show that the general formulation developed in this article leads
correctly to this expression for the case of thermal equilibrium; the proof
depends on $Z_{Nr} = Z_{rN}$ which appropriately applies.
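
For orientation on magnitudes, Eq. (24) can be evaluated for a representative resistor. This is a minimal sketch; the function names and the sample values (300 K, 1000 ohms, 10 kHz) are illustrative assumptions and are not taken from the text.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant k in J/K


def johnson_spectral_density(temperature_k, resistance_ohm):
    """Spectral density S(dV, omega) = 4 k T R, in V^2 per Hz."""
    return 4.0 * BOLTZMANN * temperature_k * resistance_ohm


def rms_noise_voltage(temperature_k, resistance_ohm, bandwidth_hz):
    """Root-mean-square noise voltage from <dV^2> = 4 k T R * bandwidth."""
    return math.sqrt(johnson_spectral_density(temperature_k, resistance_ohm) * bandwidth_hz)


density = johnson_spectral_density(300.0, 1000.0)
v_rms = rms_noise_voltage(300.0, 1000.0, 1.0e4)
```

With these assumed values the rms voltage is a few tenths of a microvolt, which sets the scale against which excess device noise is judged.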

An additivity theorem for independent spectral densities holds for the $\delta F_j$
of Eq. (20) provided the individual $\delta F_j$ are independent of each other, as
we shall show is true for the $\delta P_\alpha$ terms. Then in $\langle|b_\omega|^2\rangle_\omega$, averaged over
many frequencies, contributions of terms from $\delta F_j\,\delta F_k$ of the form
$2\,\mathrm{Re}[M_j(\omega)a_{j\omega}M_k^*(\omega)a^*_{k\omega}]$ average to zero leaving only

$$\langle|b_\omega|^2\rangle_\omega = \sum_j |M_j(\omega)|^2 \langle|a_{j\omega}|^2\rangle_\omega, \qquad (25)$$

so that, since $S(\delta F_j, \omega) = 2\langle|a_{j\omega}|^2\rangle_\omega T_p$,

$$S(\delta N, \omega) = \sum_j |M_j(\omega)|^2 S(\delta F_j, \omega), \qquad (26)$$

a relationship which may be called the additivity theorem for spectral densities.

Fig. 3. The double integral for $x_j^2$ considered as a lower large triangle having $t'' < t'$
with subdivision into triangles and squares.
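
The cancellation of cross terms behind Eq. (26) can be seen in a small numerical experiment. The sketch below assumes two independent sources whose Fourier coefficients have fixed magnitude and random, seeded phases on each of many lines in a narrow band, and two hypothetical complex factors $M_j$ taken constant over the band. None of these numbers come from the text.

```python
import cmath
import math
import random


def band_average(values):
    """Average of |value|^2 over the Fourier lines in a narrow band."""
    return sum(abs(v) ** 2 for v in values) / len(values)


def additivity_check(num_lines=20000, seed=1):
    """Return (<|b|^2>, sum_j |M_j|^2 <|a_j|^2>) for two independent sources."""
    rng = random.Random(seed)
    gains = [2.0 + 1.0j, 0.5 - 3.0j]  # hypothetical M_j, constant over the band
    amplitudes = [1.0, 0.7]           # hypothetical |a_j| for each source
    sources = [
        [cmath.rect(amp, rng.uniform(0.0, 2.0 * math.pi)) for _ in range(num_lines)]
        for amp in amplitudes
    ]
    # b_omega = sum_j M_j a_j,omega on every line, as in Eq. (22).
    outputs = [
        sum(m * src[k] for m, src in zip(gains, sources))
        for k in range(num_lines)
    ]
    lhs = band_average(outputs)
    rhs = sum(abs(m) ** 2 * band_average(src) for m, src in zip(gains, sources))
    return lhs, rhs


lhs, rhs = additivity_check()
```

The two sides agree only to within a statistical error that shrinks as the number of lines in the band grows, which is the sense in which the cross terms "average to zero."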

This additivity theorem of Eq. (26) can be applied to $S(\delta V, \omega)$ in terms of
$S(\delta P_\alpha, \omega)$. We shall for simplicity assume that $\nabla Z_{Nr}$ is parallel to $\delta\mathbf{P}_\alpha$. (For
example, if the surfaces of constant $Z_{Nr}$ are perpendicular to lines of current
flow in a “one-dimensional” structure, then the relevant $\delta P_\alpha$ is the component
parallel to current flow.) We shall, furthermore, argue that, because of the size
of the $\Delta(\mathrm{vol})_\alpha$ regions compared to a mean free path of a carrier, there is no
correlation between the $\delta P_\alpha$ of one region and $\delta P_\beta$ of another. Consequently
from

$$\delta V_N = \sum_\alpha \nabla Z_{Nr}\,\delta P_\alpha \qquad (27)$$

we can conclude from the additivity theorem that

$$S(\delta V_N, \omega) = \sum_\alpha |\nabla Z_{Nr}|^2 S(\delta P_\alpha, \omega). \qquad (28)$$

This reduces the noise calculation problem to finding a systematic way of
calculating $S(\delta P_\alpha, \omega)$ and carrying out the sum over the $\Delta(\mathrm{vol})_\alpha$.

In keeping with $\delta F_j$ above, we consider the expansion of $\delta P_\alpha$ in a Fourier
series with period $T_p$:

$$\delta P_\alpha = q\sum_\omega a_\omega e^{i\omega t} = qU_\alpha(t), \qquad (29)$$

where $U_\alpha(t)$ is introduced for convenience in writing equations:

$$U_\alpha(t) = \sum_j u_j(t) \quad \text{for } \Delta(\mathrm{vol})_\alpha. \qquad (30)$$

The spectral density of the dipole current is thus

$$S(\delta P_\alpha, \omega) = 2q^2\langle|a_\omega|^2\rangle_\omega T_p. \qquad (31)$$

The problem is to evaluate $\langle|a_\omega|^2\rangle_\omega$ from known properties of the $u_j(t)$ as
they are summed in $U_\alpha$. Since the $U_\alpha$ is real, $a_\omega$ and $a_{-\omega}$ are complex conjugates
and as Fourier coefficients are determined by

$$T_p a_\omega = T_p a^*_{-\omega} = \int_0^{T_p} e^{-i\omega t}\,U_\alpha(t)\,dt. \qquad (32)$$

The desired quantity $|a_\omega|^2$ is given by

$$T_p^2|a_\omega|^2 = \int_0^{T_p}\!\!\int_0^{T_p} e^{i\omega(t'-t'')}\,U_\alpha(t')\,U_\alpha(t'')\,dt'\,dt''. \qquad (33)$$

This double integral expression is the subject of the next section.


VI. Autocorrelation, Diffusion, and Noise 

A. The autocorrelation function 

The autocorrelation function is the basic concept used in evaluating the
double integral of Section V for $T_p^2|a_\omega|^2$. We shall discuss the autocorrelation
function first not for the case of $U_\alpha(t)$ but instead for $u_j(t)$ for one carrier. The
autocorrelation function for $u_j(t)$ as we shall use it is an average over all the
carriers (either holes or electrons, not both) that are in $\Delta(\mathrm{vol})_\alpha$ during one
period $T_p$. We denote the function as $u_c^2$ where the subscript c indicates
“correlation”:

$$u_c^2(t' - t'') = \langle u_j(t')u_j(t'')\rangle_j = u_c^2(t'' - t'). \qquad (34)$$

This average over many carriers “j” may be replaced by an average over one
carrier for a long time provided the carrier stays in a similar environment. In
calculating $S(\delta P_\alpha, \omega)$ the average must in principle be over the many “j”
carriers that spend varying times $T_j$ in $\Delta(\mathrm{vol})_\alpha$. In an active semiconductor
device, the behavior of any one carrier may be quite different in $\Delta(\mathrm{vol})_\beta$ than
it was in $\Delta(\mathrm{vol})_\alpha$; for example, it may be in a higher electric field and become
hot so that $u_c^2(t)$ may change substantially.

As we shall show below, the important attribute for calculating noise is
what we shall call the diffusion of $u$ at frequency $\omega$:

$$D(u, \omega) = \mathrm{Re}\int_0^\infty e^{i\omega t}\,u_c^2(t)\,dt. \qquad (35)$$




W. SHOCKLEY, JOHN A. COPELAND, AND R. P. JAMES 


For the simple case in which $u$ represents the x-component of random velocity
of Brownian movement for which the ideal noiseless state is the particle at
rest, $D(u, 0)$ reduces to the diffusion constant.

A correlation time $\tau_c$ can be defined in terms of the diffusion constant
$D(u, 0)$ and the mean squared random velocity:

$$D(u, 0) = \int_0^\infty u_c^2(s)\,ds = u_c^2(0)\,\tau_c = \langle u_j^2\rangle_j\,\tau_c, \qquad (36)$$

where the definition of the mean square follows from (34). For many cases
of interest $u_c^2(s)$ is $\langle u_j^2\rangle_j$, or $\langle u^2\rangle$ for short, times an exponential decay factor
$\exp(-s/\tau_{\mathrm{mft}})$ where $\tau_{\mathrm{mft}}$ is the mean free time. For this case $\tau_c$ defined by (36)
equals $\tau_{\mathrm{mft}}$ and $\langle u^2\rangle = \langle u_x^2\rangle = \langle v^2\rangle/3$ where $v$ is the speed of the particle.
These relationships lead to the familiar result $D = \langle v^2\rangle\tau_{\mathrm{mft}}/3$.
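
For the exponential autocorrelation just described, the integral of Eq. (35) can be done numerically and compared with its closed form. The closed form $\langle u^2\rangle\tau/(1 + \omega^2\tau^2)$ used below is the standard Lorentzian result for an exponentially decaying correlation; it is not written out in the text and is included here as a check. The function names, the truncation of the integral at 40 correlation times, and the sample values are assumptions made for the sketch.

```python
import math


def diffusion_spectrum_numeric(mean_sq_u, tau, omega, t_max_factor=40.0, steps=100000):
    """Midpoint-rule estimate of D(u, omega) = Re of the integral of
    exp(i omega t) * u_c^2(t) dt, with u_c^2(t) = <u^2> exp(-t / tau)."""
    t_max = t_max_factor * tau  # the integrand is negligible beyond this time
    dt = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        # Real part of exp(i omega t) is cos(omega t).
        total += math.cos(omega * t) * mean_sq_u * math.exp(-t / tau) * dt
    return total


def diffusion_spectrum_closed(mean_sq_u, tau, omega):
    """Closed form for the exponential case: <u^2> tau / (1 + (omega tau)^2)."""
    return mean_sq_u * tau / (1.0 + (omega * tau) ** 2)


d_zero = diffusion_spectrum_numeric(1.0, 2.0, 0.0)  # reduces to <u^2> * tau
d_half = diffusion_spectrum_numeric(1.0, 2.0, 0.5)  # omega * tau = 1 halves it
```

At zero frequency the result is the diffusion constant $\langle u^2\rangle\tau_c$ of Eq. (36); the diffusion at frequency $\omega$ falls off once $\omega\tau$ approaches unity.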

In calculating noise we shall be concerned with more general cases such 
as “intervalley” scattering in multivalley semiconductors and trapping; for 
these, $u_j(t)$ will represent deviations from the uniform, orderly, average,
noiseless v 0 (t). Many aspects of these cases are so closely related to the simple 
case of diffusion that we shall treat the latter in detail. 

B. Diffusion 


Accordingly we consider how a group of particles having velocities $u_j(t)$
along the x-axis are spread out at a time $t = T$ having all been at $x = 0$ at
$t = 0$. Evidently

$$x_j(T) = \int_0^T u_j(t)\,dt \qquad (37)$$

and

$$x_j^2 = \int_0^T\!\!\int_0^T u_j(t')\,u_j(t'')\,dt'\,dt'', \qquad (38)$$

and, if this is averaged over many particles,

$$\langle x_j^2\rangle_j = \int_0^T\!\!\int_0^T u_c^2(t' - t'')\,dt'\,dt'' \qquad (39)$$

since the order of integration and averaging can be interchanged. Before
considering the last autocorrelation integral, we shall consider the $u(t')u(t'')$
integral of Eq. (38) in detail for a very simple case:

Suppose $T = n\tau_0$ and $u$ is constant throughout each subinterval $\tau_0$ with a
value which is either $+u_0$ or $-u_0$ determined by chance. (This is suggestive of
a mean free time $\tau_0$ and mean free path $l_0 = u_0 \tau_0$. As we show below the
correlation time $\tau_c$ is $\tau_0/2$.) Then the integral for $x_j$ becomes

$$x_j = l_0 \sum_{s=1}^{n} a_{js}, \tag{40}$$

where $a_{js} = \pm 1$. [The relationship of this to the binomial distribution is
discussed following Eq. (45).] The double integral then can be divided into a
checkerboard as shown in Fig. 3. Furthermore, since $u_j(t')u_j(t'')$ is symmetrical
about the diagonal $t'' = t'$ only the lower triangle $t'' < t'$ of the large $T \times T$
square need be considered. This contains $n$ little triangles each of area $\tau_0^2/2$
and of contribution $+l_0^2/2 = T_0$ to the integral and $(n^2 - n)/2$ little squares
each giving $\pm l_0^2 = S_i$. The value of $x_j^2$ is then twice the lower big triangle:

$$x_j^2 = 2\left(nT_0 + \sum_i S_i\right). \tag{41}$$

If this is averaged over many carriers, the $S_i$ terms average to zero so that the
entire contribution comes from the diagonal strip of little triangles giving

$$\langle x_j^2\rangle_j = 2nT_0 = 2\left(\frac{T}{\tau_0}\right)\left(\frac{l_0^2}{2}\right) = 2T\left(\frac{u_0^2\tau_0}{2}\right) = 2TD, \tag{42}$$


the last forms being written for aid in identifying the diffusion constant. 

The relationship $\langle x^2 \rangle = 2DT$ is well known and readily established for
particle diffusion in an unbounded medium with diffusion constant $D$. The
proof is as follows: If the concentration is $c(x, t)$, and $\int c(x, t)\, dx = 1$, then
integration by parts readily establishes

$$\frac{d\langle x^2\rangle}{dt} = \frac{d}{dt}\int x^2 c(x,t)\,dx = \int x^2 \frac{\partial c}{\partial t}\,dx = D\int x^2 \frac{\partial^2 c}{\partial x^2}\,dx = 2D\int c\,dx = 2D, \tag{43}$$

so that in Eq. (42) it is evident that $u_0^2\tau_0/2$ must represent the diffusion constant
due to $u$. [A trivial pedagogical extension is to consider a uniform
concentration gradient and show that for the $\tau_0 u_0 = l_0$ model the flux is
$(u_0^2\tau_0/2)\,dc/dx$.]

For this case the autocorrelation function is easily seen to be $[(\tau_0 - t)/\tau_0]u_0^2$
for $t < \tau_0$ so that $D(u, 0)$ is $u_0^2\tau_0/2$ and the correlation time $\tau_c = \tau_0/2$. This
agrees with Eq. (42).

The value of $x_j^2$ for any one particle may, of course, differ greatly from the
average $\langle x_j^2\rangle_j$. The spread in $x_j^2$ can be found from $\langle x_j^4\rangle_j$ compared to
$\langle x_j^2\rangle_j^2$; squaring Eq. (41) gives

$$x_j^4 = 4\left[n^2T_0^2 + 2nT_0\sum_i S_i + \sum_i\sum_k S_iS_k\right]. \tag{44}$$

Averaging over all carriers leaves the first constant term unchanged, eliminates
the second sum, and in the double sum leaves only the $S_i^2 = l_0^4 = 4T_0^2$
terms for the $(n^2 - n)/2$ squares of Fig. 3. Consequently we obtain

$$\langle x_j^4\rangle_j = 4\left[n^2T_0^2 + 2(n^2 - n)T_0^2\right] = 3\langle x_j^2\rangle_j^2 - \left(\frac{2}{n}\right)\langle x_j^2\rangle_j^2. \tag{45}$$

Thus the mean square deviation of $x_j^2$ for large $n$ becomes

$$\langle x_j^4\rangle_j - \langle x_j^2\rangle_j^2 = 2\langle x_j^2\rangle_j^2, \tag{46}$$

showing that the off-diagonal terms account for a large spread in $x_j^2$ although
they do not contribute to $\langle x_j^2\rangle_j$. This large spread in $x_j^2$, which gives a rms
spread of $\sqrt{2}\,\langle x^2\rangle$, is exactly that given by a Gaussian distribution. [This
result is expected because our simplest model gives a final distribution of
particles that is simply the binomial coefficient $\binom{n}{r}$ where $r = (nl_0 + x)/2l_0$
since at $t = T = n\tau_0$ the particles are at integral multiples of $l_0$ and the
extreme positions are $\pm nl_0$.]
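The moments in Eqs. (42), (45), and (46) can be checked exactly for the $\pm l_0$ step model, because the final position after $n$ steps is binomially distributed. The following is a minimal sketch under that assumption; the function name `displacement_moments` and the choice of units ($l_0 = \tau_0 = 1$ in the example) are illustrative and not part of the original text.

```python
from math import comb


def displacement_moments(n, l0=1.0):
    """Exact <x^2> and <x^4> after n steps of +l0 or -l0 with equal probability."""
    m2 = 0.0
    m4 = 0.0
    for k in range(n + 1):               # k = number of +l0 steps
        p = comb(n, k) / 2 ** n          # binomial probability of k forward steps
        x = (2 * k - n) * l0             # net displacement
        m2 += p * x ** 2
        m4 += p * x ** 4
    return m2, m4


if __name__ == "__main__":
    n = 10
    m2, m4 = displacement_moments(n)
    # With l0 = tau0 = u0 = 1: D = u0^2 * tau0 / 2 = 0.5 and T = n, so 2*T*D = n
    print("<x^2> =", m2, " 2TD =", 2 * n * 0.5)
    print("<x^4> =", m4, " 3<x^2>^2 - (2/n)<x^2>^2 =", 3 * m2 ** 2 - (2 / n) * m2 ** 2)
```

For large $n$ the ratio $\langle x^4\rangle/\langle x^2\rangle^2$ approaches 3, the Gaussian value quoted in the text.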

With Eq. (46) and its interpretation we conclude our use of the $\tau_0 u_0 = l_0$
model. The important features related to the roles of diagonal and nondiagonal
terms will be seen to have significance for the more general cases considered
next.

The result that $D(u, 0)$ of Eq. (35) is the diffusion constant applies in general
to any $u(t)$ that has zero average value and corresponds to a disturbance with a
finite correlation time so that $u_c^2(t)$ vanishes rapidly for large $t$. This can be
seen by integrating Eq. (39) over the lower triangle $t'' < t'$ of Fig. 3. If the
integration is carried out first over $dt'$, the result is

$$\langle x_j^2\rangle_j = 2\int_0^T\left[\int_{t''}^{T} u_c^2(t' - t'')\,dt'\right]dt'' = 2\int_0^T D(u, 0)\,dt'' = 2TD(u, 0), \tag{47}$$

which shows the diffusion relationship in general.

(The factor 3 in $3\langle x_j^2\rangle_j^2$ in Eq. (45) for $\langle x_j^4\rangle_j$ also follows in general by expressing
$x_j^4$ as a four-dimensional integral from 0 to $T$ for each of four $t$'s, say
$t_1$, $t_2$, $t_3$, $t_4$, the integrand being the products of the four corresponding
$u_j$'s. When the average over $j$ is carried out, significant values for the
integrand result only if pairs of the $t$'s differ by $\sim\tau_c$; this leads to three
pairings: $[(t_1 t_2)(t_3 t_4)]$, $[(t_1 t_3)(t_2 t_4)]$, and $[(t_1 t_4)(t_2 t_3)]$, and for each the
integration gives $4D^2(u, 0)T^2$ which leads to Eq. (45).)

For completeness we shall compare expression Eq. (47) for $D$ with the
familiar case of carriers having an isotropic effective mass $m$ and a mean free
time $\tau$ that is a function of energy only. For this case if the velocity at one
time has $x$ component $v_x$, then at a time $t$ later it will on the average have
$v_x e^{-t/\tau}$ so that

$$u_c^2(t) = \langle v_x^2 e^{-t/\tau}\rangle_{\mathrm{all}}, \tag{48}$$

where the average is over all carriers. The diffusion constant along the $x$-axis
is then obtained by interchanging the order of integration and averaging:

$$D(v_x, 0) = \int_0^\infty \langle v_x^2 e^{-t/\tau}\rangle\,dt = \langle v_x^2\tau\rangle_{\mathrm{all}} = \frac{\langle v^2\tau\rangle}{3}. \tag{49}$$

That this does fit the Einstein relation is seen from the fact that the mobility
$\mu$ for this model is given by the well-known formula with averaged mean free
time weighted by $v^2$

$$\mu = \frac{q\langle v^2\tau\rangle}{m\langle v^2\rangle} \tag{50}$$

and $\langle v^2\rangle = 3\langle v_x^2\rangle$ so that $3kT = \langle v^2\rangle m = 3\langle v_x^2\rangle m$ from which the result
readily follows.
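The final step can be written out explicitly. Taking Eq. (49) as $D = \langle v^2\tau\rangle/3$ and assuming Eq. (50) has the standard form $\mu = q\langle v^2\tau\rangle/(m\langle v^2\rangle)$, the ratio is

```latex
\frac{D}{\mu}
  = \frac{\langle v^2\tau\rangle/3}{q\langle v^2\tau\rangle/(m\langle v^2\rangle)}
  = \frac{m\langle v^2\rangle}{3q}
  = \frac{3kT}{3q}
  = \frac{kT}{q},
```

which is the Einstein relation; the energy dependence of $\tau$ cancels because the same weighted average $\langle v^2\tau\rangle$ appears in both $D$ and $\mu$.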

An important difference between the case of the ordinary diffusion constant 
and the $D(u, \omega)$ of this paper is that the correlation times of trapped carriers
or intervalley scattering may be long enough that the dependence of $D(u, \omega)$
upon $\omega$ may be significant in an experimentally accessible range.


C. Application to noise 

Returning to $T_p^2|a_\alpha|^2$ needed for $S(\delta P_\alpha, \omega)$, we note that the integral over
$u_j(t')u_k(t'')$ is of the form already considered for $u_j(t')u_j(t'')$. We can thus
conclude that, for the averages needed to calculate noise, only the diagonal
strip, corresponding to the triangles of Fig. 3, need be considered. In terms of
the $u_j(t)$ the expression of Eq. (33) for $T_p^2|a_\alpha|^2$ is

$$T_p^2|a_\alpha|^2 = \sum_j\sum_k\int_0^{T_p}\!\!\int_0^{T_p} e^{i\omega(t'-t'')}u_j(t')u_k(t'')\,dt'\,dt''. \tag{51}$$

In this double sum uncorrelated terms involving $u_j$ and $u_k$ products, although
they may contribute significantly to any particular $|a_\alpha|^2$, will average to zero
for calculating noise. Hence the double sum reduces to a single sum of
$u_j(t')u_j(t'')$, and since these are summed over all the electrons the integrand
$e^{i\omega(t'-t'')}u_j(t')u_j(t'')$ can be replaced by its average value $e^{i\omega(t'-t'')}u_c^2(t'-t'')$.
Since this integrand changes to its complex conjugate when $t'$ and $t''$ are
interchanged, we can by reasoning like that used for Fig. 3 and Eq. (45)
take twice the real part of the integral over the big triangle $t'' < t'$. This
procedure leads to terms like Eq. (45) for each “$j$” with $T_j$ the time each
carrier stays in $\Delta(\mathrm{vol})_\alpha$ replacing $T$ so that

$$T_p^2|a_\alpha|^2 = \sum_j T_j\,2\,\mathrm{Re}\int_0^\infty e^{i\omega s}u_c^2(s)\,ds = 2D_\alpha(u, \omega)\sum_j T_j. \tag{52}$$

Noting that if the density of carriers in $\Delta(\mathrm{vol})_\alpha$ is $n_\alpha$ the sum of times $T_j$ must
be $T_p n_\alpha \Delta(\mathrm{vol})_\alpha$, we can evaluate as follows

$$T_p^2|a_\alpha|^2 = 2D_\alpha(u, \omega)\sum_j T_j = 2D_\alpha(u, \omega)\,n_\alpha\,\Delta(\mathrm{vol})_\alpha\,T_p,$$

$$S(\delta P_\alpha, \omega) = q^2\,2T_p|a_\alpha|^2 = 4q^2D_\alpha(u, \omega)\,n_\alpha\,\Delta(\mathrm{vol})_\alpha. \tag{53}$$

The last form indicates that spectral density $S(\delta P_\alpha, \omega)$ for $\Delta(\mathrm{vol})_\alpha$ can be taken
to be a spectral density per unit volume

$$S_v(\delta P_\alpha, \omega) = 4q^2D_\alpha(u, \omega)\,n_\alpha \tag{54}$$

times the volume.

This establishment of a volume density for the spectral density permits us to
discard the $\Delta(\mathrm{vol})_\alpha$ set and replace the sum of $S(\delta P_\alpha, \omega)$ in $S(\delta V_N, \omega)$ by an
integral of $S_v$. To be consistent we shall drop the subscript $\alpha$ and change
notation as follows:

$$S_v(\delta P_\alpha, \omega) \to S_v; \quad D_\alpha(u, \omega) \to D; \quad n_\alpha \to n, \tag{55}$$

where the dependence of the new symbols upon $\mathbf{r}$ and $\omega$ is implicitly understood.

The sum for $S(\delta V_N, \omega)$ can then be written as

$$S(\delta V_N, \omega) = \int|\nabla Z_{N\mathbf{r}}|^2 S_v\,d(\mathrm{vol}) = \int|\nabla Z_{N\mathbf{r}}|^2\,4q^2Dn\,d(\mathrm{vol}). \tag{56}$$

This expression formally reduces the problem of calculating noise in a semiconductor
device into two problems: (a) calculation of the impedance field
$\nabla Z_{N\mathbf{r}}$ corresponding to introducing dipole current sources as discussed in
Section III and (b) calculation of the spectral density of the elementary fluctuations,
the $\delta P$ terms, expressed as spatial densities $S_v$.

In the next section we give the formulation a basic test by verifying that
it leads to the Johnson-Nyquist result for thermal equilibrium conditions.

VII. The Diffusion-Impedance Field Noise Formula 

The integral derived in the last section attributes the cause of the noise to
the diffusion sources $S_v(\delta P_\alpha, \omega) = 4q^2D_\alpha(u, \omega)n_\alpha$ modulated through the
impedance field to give the output voltage. We shall refer to it as the
diffusion-impedance field noise formula:

$$S(\delta V_N, \omega) = \int|\nabla Z_{N\mathbf{r}}|^2\,4q^2Dn\,d(\mathrm{vol}). \tag{57}$$

One test of the diffusion-impedance field noise formula consists of applying
it to obtain the Johnson-Nyquist result $S(\delta V_N, \omega) = 4kTR$. This result is
established by using the Einstein relationship to obtain

$$q^2Dn = q^2\left(\frac{kT\mu}{q}\right)n = kT\sigma_{\mathbf{r}}, \tag{58}$$

where $\sigma_{\mathbf{r}}$ is the ohmic conductivity of Eq. (14).

For the thermal equilibrium case $Z_{N\mathbf{r}} = Z_{\mathbf{r}N}$. This permits rewriting the
noise formula to obtain

$$S(\delta V_N, \omega) = \int|\nabla Z_{N\mathbf{r}}|^2\,4q^2Dn\,d(\mathrm{vol}) = 4kT\int|\nabla Z_{\mathbf{r}N}|^2\sigma_{\mathbf{r}}\,d(\mathrm{vol}) = 4kTR, \tag{59}$$

in keeping with the distributed power theorem of Eqs. (17) and (19).
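A minimal numerical sketch of this test follows. It assumes a uniform bar of length $L$, cross-section $A$, and conductivity $\sigma$ at low frequency, for which the impedance field is taken to have the constant magnitude $1/(\sigma A)$ along the bar; that geometry, the function names, and the slice count are assumptions made here for illustration and are not taken from the text. The source strength uses $4q^2Dn = 4kT\sigma$ as in Eq. (58).

```python
K_BOLTZMANN = 1.380649e-23  # J/K


def johnson_noise_from_impedance_field(sigma, length, area, temperature, n_slices=1000):
    """Sum |dZ/dx|^2 * 4kT*sigma * d(vol) over thin slices of a uniform bar."""
    grad_z = 1.0 / (sigma * area)        # assumed impedance field magnitude (ohm per unit length)
    dx = length / n_slices
    total = 0.0
    for _ in range(n_slices):
        d_vol = area * dx                # volume of one slice
        total += grad_z ** 2 * 4.0 * K_BOLTZMANN * temperature * sigma * d_vol
    return total


def johnson_noise_direct(sigma, length, area, temperature):
    """Johnson-Nyquist voltage spectral density 4kTR with R = L / (sigma * A)."""
    resistance = length / (sigma * area)
    return 4.0 * K_BOLTZMANN * temperature * resistance


if __name__ == "__main__":
    # Illustrative SI values: sigma in S/m, length in m, area in m^2, T in K
    args = (100.0, 1e-3, 1e-6, 300.0)
    print(johnson_noise_from_impedance_field(*args), johnson_noise_direct(*args))
```

The two numbers agree because $\int|\nabla Z|^2\sigma\,d(\mathrm{vol}) = L/(\sigma A) = R$ for this geometry.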

VIII. Trapping and Intervalley Scattering 

We shall next evaluate the spectral density of a noise source for carriers
drifting under an electric field $E$ along the $x$-axis in a volume
$\Delta(\mathrm{vol}) = \Delta x\,\Delta y\,\Delta z$. We shall suppose that the carrier density is $n$ and that the carriers
may be in either of two conditions; on the average a fraction $\alpha$ of the carriers
are in an “a” condition and $\beta$ are in a “b” condition. These conditions may
be a = mobile, b = trapped for one example or a = fast valley, b = slow
valley for another. We shall also denote the numbers of the carriers by $a$ and $b$:

$$a = \alpha n\,\Delta(\mathrm{vol}), \tag{60}$$

$$b = \beta n\,\Delta(\mathrm{vol}). \tag{61}$$

We shall suppose that the carriers in condition a have a microscopic drift
velocity along the $x$ axis of $v_a$ and fluctuations $u_a$ about this and similarly
define $v_b$ and $u_b$. The average drift velocity is

$$v = \alpha v_a + \beta v_b, \tag{62}$$

and the value of $u$ is

$$u = \beta\Delta + u_a = v_a - v + u_a \quad \text{for “a” carriers},$$

$$u = -\alpha\Delta + u_b = v_b - v + u_b \quad \text{for “b” carriers},$$

where

$$\Delta = v_a - v_b. \tag{63}$$

We suppose that a carrier makes many transitions between the two conditions
while passing through $\Delta(\mathrm{vol})$.

We shall first assume no interference effects, such as limited availability of
traps, and take $\nu_a$ as the probability per unit time of transition from a to b and $\nu_b$ for b to a.
Then if the numbers of carriers are not at their steady-state values

$$\dot a = -\nu_a a + \nu_b b, \tag{64}$$

$$\dot b = \nu_a a - \nu_b b. \tag{65}$$

Multiplying the equations by $\nu_a$ and $\nu_b$ and subtracting gives

$$\frac{d}{dt}(\nu_a a - \nu_b b) = -\nu(\nu_a a - \nu_b b), \tag{66}$$

where $\nu$, given by

$$\nu = \nu_a + \nu_b = \frac{1}{\tau}, \tag{67}$$

is obviously the decay constant for any disturbance from the steady-state
condition with $\nu_a a - \nu_b b = 0$. Evidently

$$\alpha = \frac{\nu_b}{\nu_a + \nu_b}, \qquad \beta = \frac{\nu_a}{\nu_a + \nu_b}. \tag{68}$$

In terms of this model we shall show that if $\nu$ is much less than $1/\tau_{ca}$ and $1/\tau_{cb}$

$$u_c^2(s) = \alpha\beta\Delta^2e^{-\nu s} + \alpha u_{ca}^2(s) + \beta u_{cb}^2(s). \tag{69}$$

To prove this we note that $u(t')u(t' + s)$ includes four cases classified according
to whether the carrier is in condition “a” or “b” at $t'$ and whether the carrier
is in condition “a” or “b” at $t' + s$.

In order to calculate $u_c^2(s)$ for this model we assume that the relaxation
times $\tau_{ca}$ and $\tau_{cb}$ for $u_a$ and $u_b$ are much less than the decay time $1/\nu$ for
transition. Hence in $u(t')u(t' + s)$ the contribution from $u_a(t')u_a(t' + s)$ is
found a fraction $\alpha$ of the time and $u_b(t')u_b(t' + s)$ a fraction $\beta$.

If $s$ is comparable to $\tau_{ca}$ or $\tau_{cb}$ the chance that $u(t')$ and $u(t'')$ correspond to
different conditions is $\tau_{ca}\nu$ or $\tau_{cb}\nu$, and this is assumed negligible. Hence $u_a$
and $u_b$ contribute $\alpha u_{ca}^2(s)$ and $\beta u_{cb}^2(s)$ to $u_c^2(s)$.

The effect of the $\Delta$ term may be seen simply. Suppose we consider a sample
of many carriers all in condition a with $u = \beta\Delta$ at $t = 0$. Their average value of
$u$ decays to zero as $\beta\Delta\exp(-\nu t)$. Hence for them $\langle u(0)u(t)\rangle = \beta^2\Delta^2\exp(-\nu t)$.
The chance of finding a carrier in state a is $\alpha$ however, so the contribution
that these carriers make to $u_c^2(s)$ is $\alpha\beta^2\Delta^2\exp(-\nu s)$. A similar contribution
comes from carriers initially in condition b.

Combining all these contributions leads to

$$u_c^2(s) = \alpha\beta\Delta^2e^{-\nu s} + \alpha u_{ca}^2(s) + \beta u_{cb}^2(s). \tag{70}$$

The resulting value for $D(u, \omega)$ is

$$D(u, \omega) = \frac{\alpha\beta\Delta^2\tau}{1 + (\omega\tau)^2} + \alpha D(u_a, \omega) + \beta D(u_b, \omega).$$

This expression for $D$ may be used in the diffusion-impedance field formula.
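The two-condition result can be evaluated directly once the transition rates and velocities are given. The sketch below is illustrative only: the function names are hypothetical, and the within-condition terms $D(u_a, \omega)$ and $D(u_b, \omega)$ are passed in as plain numbers `d_a` and `d_b`, which amounts to assuming they are already known at the frequency of interest.

```python
def two_state_fractions(nu_a, nu_b):
    """Steady-state fractions (alpha, beta) of Eq. (68) for rates a->b and b->a."""
    nu = nu_a + nu_b
    return nu_b / nu, nu_a / nu


def two_state_diffusion(nu_a, nu_b, v_a, v_b, d_a, d_b, omega):
    """D(u, omega) for carriers switching between conditions a and b.

    d_a and d_b stand for D(u_a, omega) and D(u_b, omega), supplied by the caller.
    """
    alpha, beta = two_state_fractions(nu_a, nu_b)
    tau = 1.0 / (nu_a + nu_b)            # decay time of Eq. (67)
    delta = v_a - v_b                    # velocity difference of Eq. (63)
    switching = alpha * beta * delta ** 2 * tau / (1.0 + (omega * tau) ** 2)
    return switching + alpha * d_a + beta * d_b


if __name__ == "__main__":
    # Arbitrary illustrative units
    alpha, beta = two_state_fractions(1.0, 3.0)
    print("alpha, beta =", alpha, beta)
    for omega in (0.0, 4.0, 40.0):
        print(omega, two_state_diffusion(1.0, 3.0, 2.0, 0.0, 1.0, 0.0, omega))
```

The switching term is a Lorentzian that falls to half its low-frequency value at $\omega\tau = 1$, which is where the frequency dependence of $D(u, \omega)$ becomes noticeable when $\tau$ is long.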

Appendix A. The Reciprocity Theorem for Imrefs 

The results of the previous sections can be extended to cases where holes and
electrons are not in equilibrium with each other or with traps or recombination
centers by making use of the imref or quasi-Fermi level. The method used
here introduces a set of imrefs: $\varphi_p$ for holes, $\varphi_n$ for electrons, $\varphi_t$ for a trap.
For the situation shown in Fig. 4 all of these $\varphi$'s represent small disturbances
from $\varphi = 0$, the thermal equilibrium level.

Fig. 4. Conditions considered in proof of reciprocity theorem for imrefs: (a) Hole
current $I_p$ at $\mathbf{r}_p$ results in imref $\varphi_n$ at $\mathbf{r}_n$. (b) Electron current $I'_n$ at $\mathbf{r}_n$ results in imref $\varphi'_p$
at $\mathbf{r}_p$.

The free hole charge density $\rho_p$ for
small disturbances from the equilibrium (subscript 0) condition is

$$\rho_p = \rho_i\exp[(\varphi_p - V_i)/V_\theta] = \rho_{p0}\exp[(\varphi_p - \varphi_e)/V_\theta] \approx \rho_{p0} + \frac{\rho_{p0}}{V_\theta}(\varphi_p - \varphi_e), \quad |\varphi_p - \varphi_e| \ll V_\theta, \tag{A1}$$

where $\varphi_e$ is the small signal disturbance in the intrinsic level $V_i$ where

$$V_i = V_{i0} + \varphi_e. \tag{A2}$$

The free electron charge density is similarly

$$\rho_n = -\rho_i\exp[(V_i - \varphi_n)/V_\theta] = \rho_{n0}\exp[(\varphi_e - \varphi_n)/V_\theta] \approx \rho_{n0} + \frac{\rho_{n0}}{V_\theta}(\varphi_e - \varphi_n), \quad |\varphi_e - \varphi_n| \ll V_\theta. \tag{A3}$$

The imrefs lead to current densities of the form

$$\mathbf{J}_p = -\mu_p\rho_p\nabla\varphi_p = -\sigma_p\nabla\varphi_p = \sigma_p\mathbf{E}_p, \tag{A4}$$

$$\mathbf{J}_n = +\mu_n\rho_n\nabla\varphi_n = -\sigma_n\nabla\varphi_n = \sigma_n\mathbf{E}_n, \tag{A5}$$

where the $\mathbf{E}$'s are effectively electric fields that would be measured using
pairs of p+ and n+ probes.

The dielectric displacement current density $\mathbf{J}_e$ due to rate of change of
$\mathbf{E}_e = -\nabla\varphi_e$, the small signal electric field, is

$$\mathbf{J}_e = i\omega K\mathbf{E}_e = -i\omega K\nabla\varphi_e = \sigma_e\mathbf{E}_e, \tag{A6}$$

where $K$ is the permittivity in farads/cm. This notation enables us usefully to
write for the total current density

$$\mathbf{J} = \mathbf{J}_p + \mathbf{J}_n + \mathbf{J}_e = \sum_\alpha\mathbf{J}_\alpha, \tag{A7}$$

where, of course,

$$\nabla\cdot\mathbf{J} = 0. \tag{A8}$$

The concepts just discussed for currents and imrefs are used in deriving a 
reciprocity theorem for imrefs for situations like that represented in Fig. 4. 
Here two conditions are represented: 

1. Unprimed. A hole current $I_p$ is injected at a vector position $\mathbf{r}_p$ and flows
out at ground producing an imref $\varphi_n$ for electrons at position $\mathbf{r}_n$. Evidently for
small signals $\varphi_n$ must be proportional to $I_p$ times a function of the two position
vectors:

$$\varphi_n = Z(\mathbf{r}_n; \mathbf{r}_p)I_p, \tag{A9}$$

where the semicolon separates the position and type of the output voltage, to
the left, from that of the input, to the right.

2. Primed. An electron current $I'_n$ is introduced at $\mathbf{r}_n$ producing an imref
$\varphi'_p$ at $\mathbf{r}_p$. For this condition:

$$\varphi'_p = Z(\mathbf{r}_p; \mathbf{r}_n)I'_n. \tag{A10}$$

Our problem is to prove from the same equations used for semiconductor
device analysis that these two $Z$'s are equal. [This result also follows, of
course, from much more general considerations, namely Onsager's principle of
microscopic reversibility (7). However, we wish to check the imref methodology
for future use in noise calculations.]

If the “two-vector” $Z$'s are reciprocal, the same will be true for two pairs
of circuits. Thus if $Z(\mathbf{r}_1, \mathbf{r}_2; \mathbf{r}_3, \mathbf{r}_4)$, or more compactly $Z(12, 34)$, is the
voltage developed between $\mathbf{r}_1$ and $\mathbf{r}_2$ per unit current into $\mathbf{r}_3$ and out of $\mathbf{r}_4$,
then we can prove $Z(12, 34) = Z(34, 12)$ as follows:

$$Z(12, 34) = Z(1, 3) - Z(2, 3) - Z(1, 4) + Z(2, 4)$$

$$= Z(3, 1) - Z(3, 2) - Z(4, 1) + Z(4, 2)$$

$$= Z(34, 12). \tag{A11}$$
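The algebra of Eq. (A11) can be illustrated with a small numerical check: if the two-point quantities form a symmetric table, the four-point combination is automatically symmetric under exchange of the two pairs. The matrix entries below are hypothetical values chosen only for illustration, and `four_point` is an illustrative helper name.

```python
def four_point(z, i, j, k, l):
    """Z(ij, kl) built from two-point values z[a][b] as in Eq. (A11)."""
    return z[i][k] - z[j][k] - z[i][l] + z[j][l]


# Hypothetical symmetric two-point transfer impedances (ohms) among four points
z_two_point = [
    [5.0, 2.0, 1.5, 0.7],
    [2.0, 4.0, 1.1, 0.9],
    [1.5, 1.1, 6.0, 2.4],
    [0.7, 0.9, 2.4, 3.0],
]

forward = four_point(z_two_point, 0, 1, 2, 3)   # Z(12, 34)
reverse = four_point(z_two_point, 2, 3, 0, 1)   # Z(34, 12)

if __name__ == "__main__":
    print(forward, reverse)
```

Breaking the symmetry of any off-diagonal pair in the table generally makes the two values differ, which is why the proof reduces to establishing the two-point reciprocity.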

In order to prove this theorem we introduce a vector $\mathbf{H}$

$$\mathbf{H} = \sum_\alpha(\varphi_\alpha\mathbf{J}'_\alpha - \varphi'_\alpha\mathbf{J}_\alpha), \tag{A12}$$

where the three $\varphi$'s and $\mathbf{J}$'s are functions of position in the specimen, and
apply Gauss' theorem to Fig. 4 taking as the surface the outer surface of the
specimen (for this purpose imagine all surface recombination centres and
traps are moved 18 Å inside so that the $\mathbf{J}$'s are tangential), a surface just
inside “N”, a surface just inside the ground contact, and two small spheres
around $\mathbf{r}_n$ and $\mathbf{r}_p$. The integral of $\mathbf{H}\cdot d\mathbf{S}$ (outward normal) gives zero on all but
the small spheres at $\mathbf{r}_n$ and $\mathbf{r}_p$: either because all $\varphi$'s are zero, or all $\varphi$'s are
equal (at “N”) and $\int\mathbf{J}\cdot d\mathbf{S}$ gives zero, or $\mathbf{J}_n\cdot d\mathbf{S} = \mathbf{J}_p\cdot d\mathbf{S} = \mathbf{J}_e\cdot d\mathbf{S} = 0$ on the
free surface. From the spheres at $\mathbf{r}_n$ and $\mathbf{r}_p$ we get the only contributions and
these are found to give

$$\oint\mathbf{H}\cdot d\mathbf{S} = -\varphi_n I'_n + \varphi'_p I_p = -Z(\mathbf{r}_n; \mathbf{r}_p)I_pI'_n + Z(\mathbf{r}_p; \mathbf{r}_n)I'_nI_p = \int\nabla\cdot\mathbf{H}\,d(\mathrm{vol}), \tag{A13}$$

the last equality being Gauss' theorem.

For a linear disturbance to a semiconductor in thermal equilibrium with no
field the divergence in (A13) vanishes (as we shall show below) so that the
two-point reciprocity is established. To show that $\nabla\cdot\mathbf{H} = 0$ we make use of
symmetry relationships in the elementary process. First we note that $\nabla\cdot\mathbf{H}$ has
one part of the form $\nabla\varphi\cdot\mathbf{J}$ and another of the form $\varphi\nabla\cdot\mathbf{J}$. The first part can
be rewritten, apart from an overall sign, as

$$\sum_\alpha(\mathbf{E}_\alpha\sigma_\alpha\mathbf{E}'_\alpha - \mathbf{E}'_\alpha\sigma_\alpha\mathbf{E}_\alpha). \tag{A14}$$

This is clearly zero if $\sigma_\alpha$ is a scalar and even if $\sigma_\alpha$ is a symmetrical tensor, as it
should be in the absence of a magnetic field.

The $\varphi\nabla\cdot\mathbf{J}$ terms are more complicated. The proof that they give zero depends
on showing that the $\nabla\cdot\mathbf{J}$'s are related to the $\varphi$'s by an admittance matrix with
coefficients $A_{\alpha\beta}$ having dimensions of mho/cm$^3$:

$$\nabla\cdot\mathbf{J}_\alpha = \sum_\beta A_{\alpha\beta}\varphi_\beta \tag{A15}$$

and that $A_{\alpha\beta} = A_{\beta\alpha}$ from which it follows that the $+\varphi\nabla\cdot\mathbf{J}$ and $-\varphi\nabla\cdot\mathbf{J}$ terms
in $\nabla\cdot\mathbf{H}$ cancel so that $\nabla\cdot\mathbf{H} = 0$.

The proof of the reciprocity theorem for the $Z$'s can thus be completed by
showing that the six nondiagonal matrix elements satisfy $A_{pn} = A_{np}$,
$A_{pe} = A_{ep}$, and $A_{ne} = A_{en}$. We shall prove these equalities for a simplified
model having one acceptor trap and a direct recombination mechanism
(or one so fast that storage in the centers is negligible). For this model we
must consider three varying charge densities; in addition to $\rho_n$ and $\rho_p$, the
charge density of holes on traps varies from its thermal equilibrium value
$\rho_{pt0}$ by an ac component $\delta\rho_{pt}$ (8):

$$\delta\rho_{pt} = [\rho_{pt0}(1 - f_{pt0})/V_\theta](\varphi_t - \varphi_e) = C_t(\varphi_t - \varphi_e), \tag{A16}$$

where $f_{pt0}$ is the fraction of traps that have holes on them at thermal equilibrium.
$C_t$ is defined by the identity and has the dimensions of farads/cm$^3$. If
the thermal equilibrium rate of emission of the charge density of holes from
traps is $e_p\rho_{pt0}$, then the net rate of capture of hole charge on traps is

$$\delta\dot\rho_{pt} = i\omega\,\delta\rho_{pt} = (e_p\rho_{pt0}/V_\theta)(\varphi_p - \varphi_t) = A_t(\varphi_p - \varphi_t), \tag{A17}$$

where the admittance $A_t$ in mho/cm$^3$ is defined by the identity. Eliminating
$\varphi_t$, we obtain

$$\delta\rho_{pt} = [C_t/(1 + i\omega\tau_t)](\varphi_p - \varphi_e) = C_t(\omega)(\varphi_p - \varphi_e), \tag{A18}$$

where $C_t(\omega)$ is defined by the identity and

$$\tau_t = C_t/A_t. \tag{A19}$$
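The elimination of the trap imref can be spelled out as follows, writing $\varphi_t$, $\varphi_p$, $\varphi_e$ for the trap, hole, and intrinsic-level disturbances and using only the relations (A16) and (A17) in the forms $\delta\rho_{pt} = C_t(\varphi_t - \varphi_e)$ and $i\omega\,\delta\rho_{pt} = A_t(\varphi_p - \varphi_t)$:

```latex
i\omega C_t(\varphi_t - \varphi_e)
   = A_t\bigl[(\varphi_p - \varphi_e) - (\varphi_t - \varphi_e)\bigr]
\;\Longrightarrow\;
(\varphi_t - \varphi_e)(A_t + i\omega C_t) = A_t(\varphi_p - \varphi_e)
\;\Longrightarrow\;
\varphi_t - \varphi_e = \frac{\varphi_p - \varphi_e}{1 + i\omega C_t/A_t}.
```

Multiplying by $C_t$ and writing $\tau_t = C_t/A_t$ gives the frequency-dependent trap capacitance $C_t/(1 + i\omega\tau_t)$: at low frequency the traps follow the hole imref, and at high frequency their contribution falls off as $1/\omega\tau_t$.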

The rate of generation of hole charge density through recombination centers
may be written for a fast recombination model as $A_g(\varphi_n - \varphi_p)$ so that the
equation of continuity for hole charge becomes

$$\nabla\cdot\mathbf{J}_p = -\dot\rho_p - \delta\dot\rho_{pt} + A_g(\varphi_n - \varphi_p) = -i\omega[(\rho_{p0}/V_\theta) + C_t(\omega)](\varphi_p - \varphi_e) + A_g(\varphi_n - \varphi_p), \tag{A20}$$

in which the $\rho_{p0}$ term comes directly from (A1). This equation gives $A_{pe}$ and
$A_{pn}$; for example $A_{pn} = A_g$. Similarly

$$\nabla\cdot\mathbf{J}_n = -i\omega(-\rho_{n0}/V_\theta)(\varphi_n - \varphi_e) + A_g(\varphi_p - \varphi_n), \tag{A21}$$

which gives $A_{ne}$ and $A_{np}$; $A_{np}$ satisfies the symmetry condition $A_{np} = A_g = A_{pn}$.
The $A_{en}$ and $A_{ep}$ matrix elements are obtained from (A8):

$$\nabla\cdot\mathbf{J}_e = -\nabla\cdot\mathbf{J}_p - \nabla\cdot\mathbf{J}_n = i\omega[(\rho_{p0}/V_\theta) + C_t(\omega)](\varphi_p - \varphi_e) + i\omega(-\rho_{n0}/V_\theta)(\varphi_n - \varphi_e). \tag{A22}$$

Inspection shows that the necessary symmetry is preserved through the
appearance of the differences of $\varphi$'s always in the form $A_{\alpha\beta}(\varphi_\alpha - \varphi_\beta)$ in $\nabla\cdot\mathbf{J}_\alpha$
and minus the same form in $\nabla\cdot\mathbf{J}_\beta$ so that $-A_{\alpha\beta}(\varphi_\alpha - \varphi_\beta) = A_{\beta\alpha}(\varphi_\beta - \varphi_\alpha)$
and $A_{\alpha\beta} = A_{\beta\alpha}$. It is intuitively clear that generalization of the model considered
here will still preserve the symmetry necessary to establish the
reciprocity of the $Z$'s.

Appendix B. Field of a One-Dimensional Sample with Carrier Drift 

The following model can be applied to the GaAs transit time diode. Consider
a rod of semiconductor with a uniform majority carrier current flowing
in the $x$ direction with drift velocity $v(x)$ where the $x$ coordinate varies from
0 to $L$. The sample is assumed uniform in the $y$ and $z$ coordinates.

If at time $t$ a quantity of charge $\delta Q$ is removed from the drifting carrier
stream at point $x$ and replaced at point $x + \delta x$, the effect is that of creating a
dipole layer which initially affects the electric field only at points between $x$
and $x + \delta x$ (see Fig. 5). As time passes two things happen: (a) the dipole
layer drifts toward the drain contact ($x = L$), and (b) the voltage step created
by the dipole layer $\delta V$ changes according to

$$\delta\dot V = -\nu\,\delta V, \tag{B1}$$

$$\nu = \frac{\rho_f\mu^*}{K}, \tag{B2}$$

where $\rho_f$ is the fixed charge, $\mu^*$ is the differential mobility, and $K$ is the dielectric
permittivity. Equation (B2) can be derived from Eq. (4.16) of Ref. (9) by
considering the effect of space charge on the length of the drifting dipole.
Because of the carrier drift, the dipole layer may drift into regions where the
dielectric relaxation constant $\nu$ is significantly different. In the GaAs transit
time diode, $\nu$ may change sign as the electric field varies along $x$.

Fig. 5. Drifting dipole layer interpretation of disturbance in stream of drifting carriers.

To discuss the development of a disturbance in the drifting carrier stream, it
is convenient to associate with each point, $x$, a timelike variable, $s$, which is
the time required for a point fixed on the carrier stream to drift from the
source contact ($x = 0$) to the point $x$.

The transit time $T$ is equal to $s(L)$. Since $s$ for this case is a monotonic function
of $x$, any function of $x$ can also be expressed as a function of $s$. The relative
change in voltage of drifting dipole layer is described by a dimensionless
function $\varphi(s)$ defined by

$$\varphi(0) = 1, \qquad \frac{d}{ds}\varphi(s) = -\nu(s)\varphi(s). \tag{B3}$$

In terms of $\varphi(s)$ an expression for the voltage, $\delta V$, across the whole sample
resulting from the displacement of charge $\delta Q$ across the segment $x$ to $x + \delta x$
as a function of time can be easily found. The initial voltage is the same as
would be produced by putting charge $\delta Q$ on a parallel plate capacitor with
capacity $KA/\delta x$. The relative change in voltage with time is given by
$\varphi(s(x) + t)/\varphi(s(x))$. The voltage is then given by

$$\delta V(t) = \frac{\delta Q\,\delta x}{KA}\,\frac{\varphi(s(x) + t)}{\varphi(s(x))}, \quad 0 < t < T - s(x),$$

$$\delta V(t) = 0, \quad t > T - s(x). \tag{B4}$$

The impedance field gradient $Z_x(x, \omega)$ is defined as the ratio of voltage produced
at the end terminals to the alternating current inserted through the
vanishingly small segment between $x$ and $x + \delta x$ divided by $\delta x$. This differential
transfer impedance can be found as a function of frequency by Fourier
analysis of the current pulse $\delta Q\,\delta(t)$, where $\delta(t)$ is the Dirac impulse function,
and the resulting voltage $\delta V(t)$ just considered:

$$I(t) = \delta Q\,\delta(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{i\omega t}I(\omega)\,d\omega, \tag{B5}$$

$$I(\omega) = \delta Q, \tag{B6}$$

and

$$\delta V(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{i\omega t}V(\omega)\,d\omega, \tag{B7}$$

$$V(\omega) = \int_{-\infty}^{\infty}e^{-i\omega t}\,\delta V(t)\,dt = \frac{\delta Q\,\delta x}{KA}\int_0^{T-s(x)}e^{-i\omega t}\,\frac{\varphi(s(x)+t)}{\varphi(s(x))}\,dt = \frac{\delta Q\,\delta x\,e^{i\omega s(x)}}{KA\,\varphi(s(x))}\int_{s(x)}^{T}e^{-i\omega t}\varphi(t)\,dt. \tag{B8}$$

Now the differential transfer impedance or impedance field gradient is

$$\frac{\partial}{\partial x}Z(x, \omega) = \frac{V(\omega)}{I(\omega)\,\delta x} = \frac{e^{i\omega s(x)}}{KA\,\varphi(s(x))}\int_{s(x)}^{T}e^{-i\omega t}\varphi(t)\,dt. \tag{B9}$$

The impedance of the whole device may be found by considering the voltage
produced at the contacts when the same current is flowing through all segments
of the sample. The result is

$$Z(\omega) = \int_0^L\frac{\partial}{\partial x}Z(x, \omega)\,dx = \frac{1}{KA}\int_0^L\frac{e^{i\omega s(x)}}{\varphi(s(x))}\int_{s(x)}^{T}e^{-i\omega t}\varphi(t)\,dt\,dx. \tag{B10}$$

Equation (B9) gives the impedance field gradient which is used in Eq. (53)
to calculate the open circuit output noise voltage. The short circuit noise
current can then be calculated by using the impedance given by Eq. (B10).

I. Introduction

The literature on the theory of excitons in molecular crystals can be 
divided into two classes according to whether the exciton is assumed to be of 
infinite extent or about the size of one molecule. The first assumption is usual 
when treating spectral properties, e.g., Silbey et al. (1965), and the second 
when treating conductivity properties, e.g., Northrop and Simpson (1956). 
The difference between these assumptions is not just the trivial one of the 
volume disturbed since the long-range contributions to the excitation energies, 
which are substantial, are present on the first assumption but absent on the 
second. The principal object of this paper is to indicate how this question may 
be resolved. It begins with a summary of the application to excitons of the 
long-wave theory which has been developed in order to deal effectively with 
long-range effects and shows how the introduction of a finite lifetime into the 
theory modifies it. In particular, the form of the exciton created during 
light absorption is contrasted with the form of the excitons involved in 
subsequent effects. 


II. The Long-Wave Theory 

The basic difficulty in calculating the excited states of a molecular crystal is 
that of evaluating the conditionally convergent sums of the dipolar interactions 
between excited molecules. The same difficulty arises in considering the 
vibrations of an ionic crystal. One solution of this problem is the long-wave 
theory described in detail by Born and Huang (1954) and summarized
elegantly by Slater (1960). This theory uses the approximation that the wavelength
of the waves is much greater than the separation between nearest
dipoles and this is certainly valid in the molecular crystal problem. The application
of the theory to the molecular exciton problem has been given by Hall
(1958, 1962) and Amos (1963) and will be described here as the limit of a
sequence of theories of gradually increasing complexity.

The first attempts to account for the spectra of molecular crystals in terms 
of the excited states of the free molecules were made by Davydov (1962) and 
are summarized in his book. In these, the interaction between a molecule and 
its immediate neighbors is the only one included in the theory. These interactions
produce wide bands of excited states, but only a few of these states
are involved in light absorption because of the conservation of crystal 
momentum between the photon and the exciton so that the spectrum of the 
crystal consists of sharp lines. If the excitation energy of a free molecule is $U$
then the interaction energy between one molecule and all of its neighbors
inside a sphere can be expressed in the form of an additional term which is
the scalar product of the dipolar transition moment $\mathbf{T}$ and an effective electric
field $\mathbf{I}$ due to the remaining molecules inside the sphere. The very short
range forces between nearest neighbors can also be included by modifying
$U$.

The excitation energy is then

$$W = U - \mathbf{T}\cdot\mathbf{I}, \tag{1}$$

and in the calculations of Craig and Hobbins (1955) and Craig and Walsh
(1958) all molecules within a sphere of radius 20 Å contribute to the effective
field $\mathbf{I}$. The apparent convergence of the lattice sums for $\mathbf{I}$ is misleading,
however, since this is a property of the sphere and not of the infinite sums.
When the sums are extended to the infinite limit there is an additional polarization
contribution of Lorenz-Lorentz type which, for transverse excitons,
gives

$$W = U - \mathbf{T}\cdot\mathbf{I} - (4\pi/3v)T^2, \tag{2}$$

where v is the volume associated with each molecule. The calculations of 
Silbey, Jortner, and Rice show that this extra term can sometimes become 
very large. 

The use of the electric field formalism in this classical way as a method of
calculating the intermolecular forces suggests that another effect needs to be
introduced into the theory. This is the polarization of the medium between
the dipoles which is described by a dielectric constant or, alternatively, by the
polarizability $\alpha$. This effect can be evaluated electrostatically and changes the
excitation energy to

$$W = U - \mathbf{T}\cdot\mathbf{I} - V\left(1 - \tfrac{4}{3}\pi\alpha\right)^{-1}, \tag{3}$$

where

$$V = \tfrac{4}{3}\pi\left(T^2/v + \alpha\,\mathbf{T}\cdot\mathbf{I}\right). \tag{4}$$

In all these refinements the electric field is treated as an electrostatic field 
distinct from the electromagnetic field of the photon. This is an acceptable 
approximation so long as the two are far from resonance, but when considering
photon absorption we are concerned precisely with resonance since the
fields must be given wave vectors $\mathbf{k}$ and angular frequencies $\omega$ which are the
same. The full electromagnetic theory then has a composite solution in which 
the energy quantum is present partly in the form of an exciton and partly in 
the electromagnetic field. This electromagnetic exciton is similar in conception 
to the polariton described by Fano (1956, 1960) and Hopfield (1958) and 
differs primarily in that it retains the use of the classical electromagnetic 
field whereas polariton theory uses second quantization both for the exciton 
field and the electromagnetic field. Although the polariton formalism can 
ultimately be extended to include all the effects listed above it is rather more 
difficult to do so because of self-energy terms. The equation for the excitation 
energy that embodies the effect of this electromagnetic coupling of the exciton 
with the photon is given by Amos (1963) as

$$W = U - \mathbf{T}\cdot\mathbf{I} + V\left(2 + \frac{c^2k^2}{\omega^2}\right)\left\{\left(1 + \tfrac{8}{3}\pi\alpha\right) - \frac{c^2k^2}{\omega^2}\left(1 - \tfrac{4}{3}\pi\alpha\right)\right\}^{-1}, \tag{5}$$

but this is inaccurate since negative frequency terms are omitted. These 
terms are important theoretically, as has been pointed out by Fowler (1964) 
and Ball and McLachlan (1964), because they lead to a dispersion relation 
which satisfies the causality principle but, in practice, their quantitative 
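effect is small. The corrected formula becomes

$$W^2 = (U - \mathbf{T}\cdot\mathbf{I})^2 + 2V(U - \mathbf{T}\cdot\mathbf{I})\left(2 + \frac{c^2k^2}{\omega^2}\right)\left\{\left(1 + \tfrac{8}{3}\pi\alpha\right) - \frac{c^2k^2}{\omega^2}\left(1 - \tfrac{4}{3}\pi\alpha\right)\right\}^{-1}. \tag{6}$$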


Although the theory now incorporates most of the significant effects 
related to the electromagnetic field it has still to be modified to allow for the 
finite lifetime of the excited state. This is not a classical effect but it can be 
brought into the classical theory by making the energy complex with a 
negative imaginary part. Thus U is replaced by U — \iy, where y is the width 
of the band or h/y the lifetime of the excited state, and this leads to the same 
formulas as those of the Weisskopf-Wigner theory (1930). In the region of 
anomalous dispersion, which is also the region of light absorption, this 
modification makes the theory very much more realistic. Thus, the dielectric 
constant at the resonant frequency becomes finite and the reflexion coefficient 
ceases to be unity. 

The theory requires some modification, particularly to the definition of the 
transition moment T, when there are several molecules of the same species 
per unit cell. Amos (1963) has made these changes and applied the theory to 
the naphthalene band around 2200 Å and the anthracene band around 2500 
Å, but experimental results are not sufficient to provide an adequate test of 
the theory. It is especially difficult to obtain independent estimates of $\alpha$ and $\gamma$. 

III. The Exciton Lifetime 

The device, adopted above, of introducing the exciton lifetime into the 
theory by means of a complex energy is an approximation to the true situation 
in which the exciton is also coupled to the phonon degrees of freedom. In 
normal circumstances, this coupling means that individual vibronic transitions 
have to be considered and the width of these is very small. Nevertheless, the 
observed absorption is a superposition of many vibrational peaks, and it may 
be argued that it is their envelope and its width which are important. In his 
study of hypochromism, Nesbet (1964) has shown that if the band shape is 
rectangular this result follows so that the value of $\gamma$ is the full Franck-Condon 
width. Nesbet uses a width of 0.74 eV for ethylene, and this is a typical value 
for all conjugated systems. It corresponds to a lifetime of $10^{-15}$ sec. 
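
The order of magnitude quoted above can be checked from $\tau = \hbar/\gamma$. The following is a minimal sketch, assuming the CODATA value of $\hbar$ in eV s; the function name is illustrative and not taken from the text.

```python
# Lifetime corresponding to a band width gamma, using tau = hbar / gamma.
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s (assumed CODATA value)


def lifetime_from_width(gamma_ev: float) -> float:
    """Return the lifetime in seconds for a width given in eV."""
    return HBAR_EV_S / gamma_ev


tau = lifetime_from_width(0.74)
print(f"lifetime for a width of 0.74 eV: {tau:.2e} s")
```

The result lies just below $10^{-15}$ sec, in line with the rounded figure in the text.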

An immediate consequence of this large effective value of $\gamma$ can be seen in 
Figure 2 of Amos’ paper which is drawn to illustrate the absorption when $\gamma$ is 
large. The absorption is centered on an excitation energy which is now 
identical with that obtained in the original theory which included only the 
interactions from neighbors lying inside a sphere of definite radius. Thus, 
in this indirect way, the calculations by Craig et al. using a sphere can be 
justified for those transitions having large values of $\gamma$. 

IV. Exciton Size 

Another consequence of the short exciton lifetime is a short extinction length. 
For a $\gamma$ of 0.74 eV the extinction length is about $10^{-5}$ cm. Following the interpretation 
of the width $\gamma$, this length must be interpreted as the result of superimposing 
a large number of bands each of which has a much longer extinction 
length. The exciton is therefore limited in extent and is created at the surface 
since the photon cannot penetrate the material to produce excitons in the 
interior. This finite size of the exciton explains why the long range terms 
vanish from the excitation energy leaving only the local effects and the spherical 
sum. In principle the exciton could have infinite extent in planes normal to 
the wave vector but in practice this is limited by the coherence of the external 
radiation which creates the exciton. The coherence diameter (Born and Wolf, 
1959) will usually include many molecules but is small in comparison with the 
specimen sizes. Thus the exciton can be pictured as finite in extent and so with 
an energy independent of specimen size or shape. 

Another aspect of this attenuation is that the movement of these excitons 
is so very heavily damped that they cannot contribute to the transport of 
energy. Furthermore, an elaborate investigation would be needed to determine 
how excitons in a region of anomalous dispersion can carry energy since 
even a classical discussion (Stratton, 1941) shows that neither the phase 
velocity nor the group velocity is appropriate though possibly the signal 
velocity could be used. It becomes necessary, therefore, to find alternative 
explanations of the phenomena usually attributed to the movement of the 
exciton. 

One of the most important of these observations is the dependence of the 
fluorescence spectrum on the concentration of impurities (Bowen et al., 1949; 
Northrop and Simpson, 1956). This is easily explained in terms of the chance 
of an impurity lying inside the finite size of the exciton and the chance of the 
impurity trapping the energy by accepting the exciton’s energy and losing 
part of it irreversibly to the lattice. 

The other major consideration involved in the discussion of mobility is of 
the fate of the original exciton. Since the lifetime is very much shorter than 
the natural lifetime of a band it is clear that the exciton is not decaying by 
emission of a photon but is being transformed internally by a phonon collision. 
The result of such a collision will usually be to leave an exciton having 
approximately the same energy but a different crystal momentum. Such an 
exciton is no longer in the region of anomalous dispersion and so has more 
normal properties. It is these excitons with their much longer lifetime and no 
attenuation which will contribute to collision processes in the body of the 
crystal. Thus, the original problem of the discrepancy between the properties 
of the excitons as assumed in describing different phenomena is resolved by 
distinguishing sharply between the originally formed excitons which are 
finite in extent and immobile, and the excitons to which these decay which 
are not extinguished and will be involved in collisions. 
I. Introduction 

Some years ago (Hurley, 1951, Part I)* the author used Goursat’s (1889) 
analysis of finite rotation groups in four dimensions ([4]) to enumerate the 
four-dimensional crystal classes. At that time no physical application of these 
groups was envisaged. Since then, however, generalized groups of symmetry 
in [3] introduced by Shubnikov (1964) and others have found extensive 
applications in crystallography (Niggli, 1964). One method for obtaining these 
generalized symmetry groups is to project the ordinary symmetry groups from 
a space of higher dimensionality. Indeed Shubnikov’s original derivation of 
the polar, gray and black-white groups of finite plane figures was by projection 
from the point groups in [3]. 

When this projection method was applied to the [4] crystal classes listed 
in Part I it was found that not all of the known polar, gray and black-white 
groups in [3] were obtained. This indicates that the list given in Part I is 
incomplete, as has been suggested by others (Niggli, 1964; Wondratschek 
and Neubüser, 1965). The enumeration of the crystal classes in [4] has been 
repeated using the same methods as before. Two errors, which led to the 
omission of several classes, were detected as well as several misprints. Now, 
227 crystal classes are found in [4] (previously 222) of which 45 are irreducible 

* Hereafter referred to as Part I. 
(previously 45). Applying the projection method to this revised list we obtain 
32 polar, 32 gray, and 58 black-white groups in [3] in agreement with results 
by other methods (Niggli, 1964). 

II. Revised Tables of Crystal Classes in [4] 

The revised list of crystal classes in [4] is given in Tables 1a, 1b, 2a, and 2b. 
As in Part I each crystal class is specified both in Goursat’s notation and by the 
number of elements with given values of the invariants $\chi$, $\sigma$ and $d$. These 
invariants are the coefficients in the characteristic equation 

$$\det(\lambda I - A) = \lambda^4 - \chi(A)\lambda^3 + \sigma(A)\lambda^2 - \chi(A)\lambda + d(A) = 0 \tag{1}$$

for the 4 × 4 orthogonal matrix A. 

The sets of values of $(\chi, \sigma, d)$ which occur are labelled as follows: 

$$\begin{aligned}
&I = (4, 6, 1); \\
&A = (0, 0, 1),\ B = (0, 1, 1),\ C = (0, -1, 1),\ D = (0, 2, 1),\ E = (0, -2, 1),\ F = (0, 0, -1); \\
&K = (1, 0, 1),\ L = (1, 1, 1),\ M = (1, 2, 1),\ N = (1, 0, -1); \\
&R = (2, 2, 1),\ S = (2, 3, 1),\ T = (2, 0, -1); \\
&Z = (3, 4, 1);
\end{aligned} \tag{2}$$

together with I′ (= −I), K′, L′, M′, N′; R′, S′, T′; Z′ formed from the 
above by changing the sign of the trace $\chi$. 
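
The labelling in list (2) can be made concrete with a short numerical sketch. It assumes NumPy and uses the identity $\sigma = [(\mathrm{tr}\,A)^2 - \mathrm{tr}(A^2)]/2$ for the coefficient of $\lambda^2$; the names `LABELS`, `invariants` and `classify` are illustrative and do not come from Part I.

```python
import numpy as np

# List (2): (chi, sigma, d) -> label, for the types with non-negative trace.
LABELS = {
    (4, 6, 1): "I",
    (0, 0, 1): "A", (0, 1, 1): "B", (0, -1, 1): "C", (0, 2, 1): "D",
    (0, -2, 1): "E", (0, 0, -1): "F",
    (1, 0, 1): "K", (1, 1, 1): "L", (1, 2, 1): "M", (1, 0, -1): "N",
    (2, 2, 1): "R", (2, 3, 1): "S", (2, 0, -1): "T",
    (3, 4, 1): "Z",
}


def invariants(a):
    """Return (chi, sigma, d) of a 4 x 4 orthogonal matrix, rounded to integers."""
    a = np.asarray(a, dtype=float)
    chi = np.trace(a)
    # sigma is the sum of products of pairs of eigenvalues.
    sigma = (chi**2 - np.trace(a @ a)) / 2.0
    d = np.linalg.det(a)
    return tuple(int(round(float(x))) for x in (chi, sigma, d))


def classify(a):
    """Return the label of list (2); a prime marks a type with negative trace."""
    chi, sigma, d = invariants(a)
    if chi < 0:
        return LABELS[(-chi, sigma, d)] + "'"
    return LABELS[(chi, sigma, d)]


# A fourfold rotation in the plane of the first two coordinates.
fourfold = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
for name, matrix in [("diag(1,1,-1,-1)", np.diag([1, 1, -1, -1])),
                     ("-I", -np.eye(4)),
                     ("fourfold rotation", fourfold)]:
    print(name, "->", classify(matrix))
```

The lookup raises a `KeyError` for a matrix whose invariants are not in list (2), which is a convenient signal that the element cannot belong to a crystal class.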

For the crystal classes containing −I (Tables 1a, 1b) we tabulate half the 
number of types A, B, …, F and omit the dashed types; an asterisk (*) denotes 
an irreducible class. Entries which differ from those of Part I are indicated 
by boldface type or in footnotes to the tables. 

The three groups X. m = 3, n = 2; XIII′. m = 4, n = 6 and XIV. m = 3 shown 
in Table 1a′ are of some interest. Although these groups contain only elements 
with integral invariants, it was shown in Part I that they are not crystal 
classes. Thus, although any element of one of these groups appears as an 
integral 4 × 4 matrix in a suitable coordinate system, it is impossible to 
choose a coordinate system such that all the elements of one of these groups 
appear in integral form simultaneously. Such groups do not exist for a space 
of dimensionality less than 4. These groups may be considered as pseudo 
crystal classes. If these three groups are included with the 227 crystal classes 
in [4], a total of 230 groups is obtained, equal in number to the space groups 
in [3]. This seems to be just a numerical coincidence. 
In all 50 crystal classes, 17 of which are irreducible. Projection gives 11 gray and 10 black-white groups in [3]. 
In Part I this group was listed again as Goursat’s XXXVII. m = μ = ν = 1. The two groups are equivalent. 

TABLE 2a 

Proper Crystal Classes in [4] Not Containing −I ᵃ 

ᵃ In all 33 crystal classes, 2 of which are irreducible. Projection gives 11 polar and 10 black-white groups in [3]. 

ᵃ In all 71 crystal classes, 6 of which are irreducible. Projection gives 21 polar, 21 gray and 27 black-white groups in [3]. From Tables 1a,b 
and 2a,b we have a grand total of 227 crystal classes, 45 of which are irreducible. Projection gives a total of 32 polar, 32 gray and 58 black-white 
groups in [3] in agreement with enumerations by other methods (Niggli, 1964). 
III. Projected Groups 

The [4] crystal classes which are reducible to the form 

$$\begin{pmatrix} X & 0 \\ 0 & \pm 1 \end{pmatrix} \tag{3}$$

with X a set of 3 × 3 integral matrixes, yield generalized symmetries in [3] 
when projected along the fourth coordinate (Shubnikov, 1964; Niggli, 1964). 
The following three cases arise: 

(1) Only positive signs occur in the reduced form (3). Projection leads to a 
polar group in [3]. The set X forms one of the ordinary crystal classes in [3]. 
The elements which occur in X are, in international notation: 

$n$: rotation through angle $2\pi/n$ ($n$ = 1, 2, 3, 4, 6) 
$\bar{n}$: rotatory-inversion through angle $2\pi/n$ ($n$ = 1, 2, 3, 4, 6), $\bar{2} = m$. 

(2) Each three-dimensional symmetry element ($n$ or $\bar{n}$) occurs twice in the 
reduced form (3), once associated with +1 and once with −1. Projection gives 
a gray group in [3] in which each symmetry element occurs both in uncolored 
form ($n$ or $\bar{n}$) and in colored form ($n'$ or $\bar{n}'$). A gray group is denoted by adding 
$1'$ to the symbol for the ordinary [3] crystal class, e.g. $\bar{4}3m1'$. 

(3) The elements (say Y) associated with +1 in the reduced form (3) form 
a subgroup of index 2 of the whole set X. Both X and Y are ordinary [3] 
crystal classes. On projection one obtains a black-white group in which the 
symmetry elements of Y are uncolored and the remaining symmetry elements 
of X are colored (i.e., dashed). Following Niggli (1964) we refer to Y as the 
kernel of the black-white group X and write Y in parenthesis after the international 
symbol for X, e.g., 32′(3). 

The invariants $\chi$, $\sigma$, and $d$ of elements in the reduced form (3) are calculated 
using the diagonal forms of the [3] symmetry elements: 

$$n = \mathrm{diag}(\varepsilon_n, \varepsilon_n^*, 1), \qquad \bar{n} = \mathrm{diag}(-\varepsilon_n, -\varepsilon_n^*, -1),$$

where 

$$\varepsilon_n = e^{2\pi i/n}.$$

Comparing the results of this calculation with the list (2), we obtain Table 3 
which shows the [3] symmetry elements which can result from projection of a 
given [4] symmetry element. 

From Table 3 we deduce: 

(a) No crystal class in [4] containing any element A, B, C, D, L, M, S, L′, 
M′, S′ can be reduced to the form (3) appropriate for projection. This is 
immediately clear from the characteristic equation (1), since neither +1 nor 
−1 is a root of this equation for these elements. 
TABLE 3 

Projection in [3] of Symmetry Elements in [4] 

    I → $1$ 
    E → $2$, $\bar{2}'$ 
    F → $\bar{4}$, $4'$ 
    K → $3$ 
    N → $\bar{3}$, $6'$ 
    R → $4$ 
    T → $\bar{2}$, $1'$ 
    Z → $6$ 

    A, B, C, D, L, M, S, L′, M′, S′ cannot be projected to [3]. 


(b) A gray group in [3], which always contains the symmetry element 1', 
can only arise by projection from a crystal class in [4] which contains at least 
one element of type T. 

(c) Since the element I′ (= −I) in [4] always gives $\bar{1}'$ in [3], we have the 
following relations for the [3] groups projected from a [4] class of Tables 1a 
and 1b: 

$$N(n) = N(\bar{n}'), \qquad N(n') = N(\bar{n}), \qquad n = 1, 2, 3, 4, 6.$$

Here N(···) denotes the number of symmetry elements of the indicated 
type. 
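
Result (a) can be verified directly from Eq. (1). The sketch below is restricted to the proper types of list (2) (d = 1), for which Eq. (1) applies as written; an improper orthogonal 4 × 4 matrix always has both +1 and −1 among its eigenvalues, so the improper types need no test. Changing the sign of $\chi$ merely interchanges the values of the polynomial at +1 and −1, so the primed types follow from the unprimed ones. The dictionary and function names are illustrative only.

```python
# Proper types (d = 1) of list (2): label -> (chi, sigma).
PROPER_TYPES = {
    "I": (4, 6), "A": (0, 0), "B": (0, 1), "C": (0, -1), "D": (0, 2),
    "E": (0, -2), "K": (1, 0), "L": (1, 1), "M": (1, 2),
    "R": (2, 2), "S": (2, 3), "Z": (3, 4),
}


def char_poly(lam, chi, sigma, d=1):
    """Left-hand side of Eq. (1) evaluated at lambda = lam."""
    return lam**4 - chi * lam**3 + sigma * lam**2 - chi * lam + d


def has_unit_root(chi, sigma):
    """True when +1 or -1 is a root, i.e. a reduction to form (3) is possible."""
    return char_poly(1, chi, sigma) == 0 or char_poly(-1, chi, sigma) == 0


not_projectable = sorted(
    label for label, (chi, sigma) in PROPER_TYPES.items()
    if not has_unit_root(chi, sigma)
)
print(not_projectable)
```

The types singled out by this check are exactly those listed in (a).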

The projected groups given in the final column of Table 1a follow immediately 
from the results (a) and (c); each entry in Table 1a which does not contain 
A, B, C, D, L, M, or S yields just one black-white group. The kernel H of this 
group G is a proper crystal class in [3], and the remaining elements of G are 
obtained from H on multiplication by the symmetry element $\bar{1}'$ (the colored 
inversion). 

The derivation of the projected groups from the [4] crystal classes of Tables 
1b, 2a and 2b is not quite so simple. We see from Table 3 that a given 
symmetry type in [4] may lead on projection to two distinct symmetry types 
in [3]. For this reason, a crystal class in [4] may yield several distinct groups 
on projection. The complications arising from this situation may be analyzed 
in terms of the invariant sums $g^{-1}\sum\chi^2$ and $g^{-1}\sum\sigma$, where the summations 
are over all elements of a [4] crystal class $D_4$ of order $g$. From the values of 
these sums inferences may be drawn concerning the reducibility of $D_4$ 
considered as a representation of itself. These inferences, which are shown in 
Table 4, follow from the character and reality conditions for irreducible 
representations (Wigner, 1959) and the relation 

$$\chi(A^2) = \chi^2(A) - 2\sigma(A), \tag{4}$$

which is a direct consequence of the definition of $\sigma$ (Eq. (1)). 

In Table 4, $\Gamma_n$ denotes a real irreducible representation of dimension $n$, 
$(\Gamma_n + \Gamma_n^*)$ denotes a pair of complex conjugate representations and dashes 
are used to distinguish inequivalent representations of the same dimension. 
The maximum number of [3] groups obtained on projection, $P_{\max}$, is given by 
the number of inequivalent, real, one-dimensional representations contained 
in $D_4$. 


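TABLE 4 

Reduced Forms of a [4] Crystal Class $D_4$ 

    $g^{-1}\sum\chi^2$   $g^{-1}\sum\sigma$   Inference                                                    $P_{\max}$ 
     1                                         $D_4 = \Gamma_4$                                             0 
     2                    0                    $D_4 = \Gamma_3 + \Gamma_1$                                  1 
                                               $D_4 = \Gamma_2 + \Gamma_2'$                                 0 
     3                    0                    $D_4 = \Gamma_2 + \Gamma_1 + \Gamma_1'$                      2 
     3                    1                    $D_4 = \Gamma_2 + (\Gamma_1 + \Gamma_1^*)$                   0 
     4                    0                    $D_4 = \Gamma_1 + \Gamma_1' + \Gamma_1'' + \Gamma_1'''$      4 
     4                    1                    $D_4 = 2\Gamma_2$                                            0 
                                               $D_4 = (\Gamma_1 + \Gamma_1^*) + \Gamma_1' + \Gamma_1''$     2 
     4                    2                    $D_4 = (\Gamma_1 + \Gamma_1^*) + (\Gamma_1' + \Gamma_1'^*)$  0 
     5                                         $D_4 = \Gamma_2 + 2\Gamma_1$                                 1 
     6                    1                    $D_4 = 2\Gamma_1 + \Gamma_1' + \Gamma_1''$                   3 
     6                    2                    $D_4 = 2\Gamma_1 + (\Gamma_1' + \Gamma_1'^*)$                1 
     8                    2                    $D_4 = 2\Gamma_1 + 2\Gamma_1'$                               2 
     8                    4                    $D_4 = 2(\Gamma_1 + \Gamma_1^*)$                             0 
    10                                         $D_4 = 3\Gamma_1 + \Gamma_1'$                                2 
    16                                         $D_4 = 4\Gamma_1$                                            1 
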

We see from Table 4 that, except for the two cases (2, 0) and (4, 1), the values 
of the invariant sums specify the reduced form of $D_4$, and hence $P_{\max}$, uniquely. 
However, even in cases where $P_{\max}$ is uniquely specified, the actual number of 
distinct [3] groups obtained by projection may fall short of $P_{\max}$. Consider, 
for example, the [4] crystal class 

$$I + E + 2T \tag{5}$$

appearing in Table 2b. Here $g^{-1}\sum\chi^2 = 6$, $g^{-1}\sum\sigma = 1$ so that, from 
Table 4, $D_4 = 2\Gamma_1 + \Gamma_1' + \Gamma_1''$, $P_{\max} = 3$. From the expressions given in 
Part I and Goursat (1889) the explicit forms of the matrixes of the group (5) are 
as follows: 

$$I = \mathrm{diag}(1, 1, 1, 1); \quad E = \mathrm{diag}(1, 1, -1, -1); \quad T = \mathrm{diag}(1, 1, -1, 1); \quad T = \mathrm{diag}(1, 1, 1, -1). \tag{6}$$

Equation (6) shows four real one-dimensional representations, two of 
which are equivalent in agreement with Table 4. However, upon projection 
along each of the coordinate axes we obtain the [3] groups (all matrixes 
diagonal): 
$$(1, 1, 1),\ (1, -1, -1),\ (1, -1, 1),\ (1, 1, -1) \quad \text{i.e., } mm2 \tag{7}$$

$$(1, 1, 1),\ (1, -1, -1),\ (1, -1, 1),\ (1, 1, -1) \quad \text{i.e., } mm2 \tag{8}$$

$$(1, 1, 1),\ (1, 1, -1)',\ (1, 1, 1)',\ (1, 1, -1) \quad \text{i.e., } m1' \tag{9}$$

$$(1, 1, 1),\ (1, 1, -1)',\ (1, 1, -1),\ (1, 1, 1)' \quad \text{i.e., } m1' \tag{10}$$



The identity of the projected groups (7) and (8) is, of course, to be expected 
since they are obtained by projecting equivalent representations. However, 
the projected groups (9) and (10) are also identical, the inequivalence of the 
projected representations being reflected merely in a permutation of the 
symmetry elements. 
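
The example of the class I + E + 2T can be reproduced with a few lines of code. This is a sketch only: the group is stored through the diagonals of Eq. (6), a projected element is represented as a pair (remaining diagonal, colored flag), and two projected groups are compared as literal sets of such pairs, which happens to suffice for this fully diagonal case. The names `GROUP`, `invariant_sums` and `project` are illustrative.

```python
from fractions import Fraction

# Diagonals of the four matrices of Eq. (6); T1 and T2 are the two elements of type T.
GROUP = {
    "I": (1, 1, 1, 1),
    "E": (1, 1, -1, -1),
    "T1": (1, 1, -1, 1),
    "T2": (1, 1, 1, -1),
}


def chi(e):
    """Trace of a diagonal element."""
    return sum(e)


def sigma(e):
    """Sum of products of pairs of diagonal entries (coefficient of lambda^2)."""
    return sum(e[i] * e[j] for i in range(4) for j in range(i + 1, 4))


def invariant_sums(group):
    """Return (g^-1 sum chi^2, g^-1 sum sigma) over the group."""
    g = len(group)
    return (Fraction(sum(chi(e) ** 2 for e in group.values()), g),
            Fraction(sum(sigma(e) for e in group.values()), g))


def project(group, axis):
    """Drop coordinate `axis`; an element is colored when the dropped entry is -1."""
    return frozenset(
        (tuple(x for k, x in enumerate(e) if k != axis), e[axis] == -1)
        for e in group.values()
    )


print(invariant_sums(GROUP))
projections = [project(GROUP, axis) for axis in range(4)]
print("distinct projected groups:", len(set(projections)))
```

Projection along the first two axes gives one set of four uncolored elements, and projection along the last two gives one set containing two colored elements, so only two distinct groups arise although $P_{\max} = 3$.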

Fortunately, nearly all the cases in which the actual number of projected 
groups falls short of $P_{\max}$ are equally simple, since they involve fully reducible 
groups, which may be considered in diagonal form. 

Most of the [3] projected groups shown in the final columns of Tables 1b, 
2a and 2b may be written down immediately using the results of Tables 3 and 
4. The remaining [3] groups were obtained from the explicit forms of the 
elements of the [4] crystal classes (Part I, Eq. (5) and Goursat (1889)). The 
final list of 32 polar, 32 gray and 58 black-white groups agrees with that 
obtained by other methods (Niggli, 1964); all the [4] crystal classes omitted 
from the lists of Part I (1 from Table 2a, 5 from Table 2b) are checked by the 
projection of [3] groups. 


Note added in proof. An independent derivation of the [4] crystal classes 
has been completed by H. Wondratschek and J. Neubüser (private communication 
by J. Neubüser). Their derivation starts from the four maximal crystal 
groups obtained by Hermann (1949, 1951). These maximal groups are entries 
XLV; LI; XXXVII, m = μ = ν = 3; XXXV, μ = 1, ν = 1, D = 12 of Table 1b. 
Wondratschek and Neubüser have used a computer to find all subgroups of 
these four groups, eliminating equivalent groups at each step. Their final 
results agree exactly with the 227 crystal classes given here. Since Hermann 
makes no use of Goursat’s (1889) work the two derivations are completely 
independent and their agreement provides a most valuable check on the 
accuracy and completeness of Tables 1a, 1b, 2a and 2b. A preliminary account 
of the two calculations has been submitted for publication in Acta 
Crystallographica. 
I. Introduction 

Appreciable progress has been made in the study of noise in classical and 
quantum systems in the Markoffian limit. For recent progress and a summary 
of the literature see the author’s series of papers (Lax, QI-QV). For slightly 
non-Markoffian systems, in other words, systems in interaction with a reservoir 
whose correlation times are short but not zero, there has also been recent 
progress. See, for example, Lax (QIII) and Argyres (1963). However, aside 
from the quasi-static approximation, there has been no practical success with 
the long correlation time situation, the case in which the reservoir correlation 
times are comparable to the relaxation times in the system. 

A useful way in which to study the long correlation time problem would be 
to find an example for which exact solutions are available and to study the 
nature of these solutions in various limiting cases. Louisell and Walker (1965) 
appear to provide us with such an example, a system harmonic oscillator 
interacting harmonically with a reservoir consisting of a continuum of independent 
harmonic oscillators. Louisell and Walker calculate the density matrix of 
the system only for an initial condition of displaced Gaussian form. They then 
show that the future of the system continues to have a displaced Gaussian 
form determined by the mean motion of the system harmonic oscillator and 
the energy of the system harmonic oscillator as a function of time. The equation 
for the mean motion of such a harmonic oscillator is easy to write down 
and to solve exactly. While the equation for the mean excitation of such a 
harmonic oscillator can also be written down fairly simply, it is not an equation 
that can be solved directly in any simple way. Louisell and Walker 
demonstrate that in the appropriate short correlation time limit the results 
typical of Markoffian systems (see Lax, QIV) are obeyed. 

* Present address: Physics Department, Massachusetts Institute of Technology, Cambridge, 
Massachusetts. 

In this paper we shall extend the Louisell-Walker work in two ways: (1) 
we shall obtain a solution for the future of the density matrix for the case in 
which the system is initially in a definite state. (2) We shall be concerned not 
so much with approximate solutions for the equations obeyed by the mean 
excitation, but rather with exact expressions for this mean excitation either 
at a general time or as t approaches infinity. Our principal contribution of the 
first kind is an exact expression for $P(mt|n0)$, the probability of a system 
oscillator being found in state m at time t if it was initially in state n at time 
0. Our principal contribution of the second kind is a proof that if the reservoir 
possesses no density of states at the perturbed frequency $\Omega_0$ of the system, 
the system does not come to equilibrium. We shall interpret this conclusion 
to mean that in such cases a transport equation does not exist, and if state n 
cannot be reached from state m in first-order perturbation theory (in other 
words using one reservoir phonon) then no time proportional transitions 
occur from state n to state m using an arbitrary number of reservoir phonons. 
This conclusion is not entirely a trivial one, since matrix elements for multiphonon 
processes exist. What happens, although we have not the space to 
demonstrate it here, is that such multiphonon processes can happen in many 
ways and the coherent sum of all transition amplitudes add to zero. See 
Fig. 1 for an example of such a multiphonon process. 

II. The Reduced Density Matrix 

To shorten the length of our equations we have adopted the Louisell-Walker 
Hamiltonian in the rotating wave approximation 

$$\mathcal{H} = \omega_0 a^\dagger a + \sum_{j=1}^{N} [\omega_j b_j^\dagger b_j + \kappa_j a^\dagger b_j + \kappa_j^* a b_j^\dagger] + e(t) a^\dagger + e^*(t) a. \tag{1}$$

This approximation is discussed in QIV, especially Appendix C. It should not 
have any qualitative effects on our conclusions. The variables $a$ and $a^\dagger$ are 
the usual destruction and creation operators of our system harmonic oscillator. 
The variables $b_j$, $b_j^\dagger$ are corresponding destruction and creation operators for 
reservoir oscillator j. The c-number function e(t) is an external driving force 
that acts on the system oscillator. We have adopted units in which $\hbar = 1$. 
The Heisenberg equations for the mean motion of these oscillators in the 
presence of the driving force can be written down easily and solved. A 
canonical transformation can then be performed in which each oscillator is 
shifted by its mean motion. The new Hamiltonian then has the same form as 
Eq. (1) with the driving terms omitted. We shall therefore omit the driving 
forces from Eq. (1) and assume that the variables $a$, $b_j$ are already taken 
relative to their mean values. 

Fig. 1. We display the ladder of system harmonic oscillator states, in the case in which 
the system frequency $\omega_0$ exceeds the maximum reservoir frequency $\omega_M$. A multiphonon 
transition from the first excited state to the ground state can, in principle, take place by the 
path shown. If each step involves the emission of large reservoir phonons, or the absorption 
of small reservoir phonons, energy conservation is possible after enough steps. Nevertheless, 
we show in the text that such transitions do not occur. 

The Louisell-Walker starting density matrix for the system can then be 
written in the form 

$$\sigma_\lambda(0) = (1 - e^{-\lambda}) \exp(-\lambda a^\dagger a) = [\bar{n}(0)]^{a^\dagger a} / [\bar{n}(0) + 1]^{a^\dagger a + 1}, \tag{2}$$

where 

$$\lambda = \hbar\omega_0 / kT_s, \tag{3}$$

and 

$$\bar{n}(0) = \langle a^\dagger(0)\, a(0) \rangle = (\exp \lambda - 1)^{-1} \equiv \bar{n}(\lambda, 0) \tag{4}$$

represents the mean excitation number in the initial state. Louisell and Walker 
choose the density matrix of the reservoir at the initial time to be in a Gaussian 
state so as to maximize the entropy. Since the equations of motion are linear 
the density matrix of system plus reservoir must remain a Gaussian for all 
time. Thus the complete density matrix of system plus reservoir is completely 
characterized by a knowledge of all the second moments as a function of time. 
These second moments obey equations discussed in some detail in Section V. 
In any case the density matrix of the system itself remains in Gaussian form 
and can therefore be written 

$$\sigma_\lambda(t) = [\bar{n}(t)]^{a^\dagger a} / [\bar{n}(t) + 1]^{a^\dagger a + 1}, \tag{5}$$

where 

$$\bar{n}(t) = \langle a^\dagger(t)\, a(t) \rangle = \bar{n}(\lambda, t) = f(t)\, \bar{n}(\lambda, 0) + g(t) \tag{6}$$

is the mean excitation of the system oscillator as a function of time. The 
positive numerical functions of time f(t) and g(t) are discussed in detail in 
Sections V-VII. Their initial values are obviously given by 

$$f(0) = 1, \qquad g(0) = 0. \tag{7}$$

The occupancy of the state with excitation number m is given by the particular 
matrix element 

$$\sigma_\lambda(t)_{mm} = [f\,\bar{n}(\lambda, 0) + g]^m / [f\,\bar{n}(\lambda, 0) + g + 1]^{m+1}. \tag{8}$$

III. Density Matrix for a Given Initial State 

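Our initial condition 

$$\sigma^{(n)}(0) = \delta(a^\dagger a, n) = \frac{1}{2\pi} \int_0^{2\pi} e^{i\theta(n - a^\dagger a)}\, d\theta = \frac{1}{2\pi} \int_0^{2\pi} e^{in\theta}\, [\sigma_{i\theta}(0)/(1 - e^{-i\theta})]\, d\theta \tag{9}$$

can be written as a linear superposition of Gaussian density matrixes. We 
can therefore write our general solution at time t as the corresponding linear 
superposition of the Louisell-Walker solutions: 

$$\sigma^{(n)}(t) = \frac{1}{2\pi} \int_0^{2\pi} e^{in\theta}\, [\sigma_{i\theta}(t)/(1 - e^{-i\theta})]\, d\theta. \tag{10}$$

The transition probability from initial state n to m is therefore given by 

$$P(mt|n0) = \sigma^{(n)}(t)_{mm} = \frac{1}{2\pi} \int_0^{2\pi} \frac{e^{in\theta}\, d\theta}{1 - e^{-i\theta}}\, \frac{[f\,\bar{n}(i\theta, 0) + g]^m}{[f\,\bar{n}(i\theta, 0) + g + 1]^{m+1}}, \tag{11}$$

where 

$$\bar{n}(i\theta, 0) = [\exp(i\theta) - 1]^{-1}. \tag{12}$$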
Using the abbreviation $u = \exp(-i\theta)$ we can rewrite (11) in the form 

$$P(mt|n0) = \frac{1}{2\pi i} \oint \frac{du}{u^{n+1}}\, \frac{[g + (f - g)u]^m}{[1 + g + (f - g - 1)u]^{m+1}}. \tag{13}$$

The further transformation 

$$u = v(1 + g)/(1 + g - f) \tag{14}$$

leads to an even simpler form 

$$P(mt|n0) = \frac{(1 + g - f)^{n-m} (f - g)^m}{(1 + g)^{n+1}}\, H\!\left(n, m, \frac{g(1 + g - f)}{(1 + g)(f - g)}\right), \tag{15}$$

$$H(n, m, z) = \frac{1}{2\pi i} \oint \frac{dv\,(z + v)^m}{v^{n+1}(1 - v)^{m+1}} = \text{coefficient of } v^n \text{ in } (z + v)^m (1 - v)^{-m-1} = z^m F(-m, -n; -m-n; -1/z)\,(m + n)!/[m!\,n!], \tag{16}$$

where F(a, b; c; z) is the hypergeometric function. 
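
Equations (15) and (16) are straightforward to evaluate numerically. The sketch below expands the coefficient of $v^n$ in $(z + v)^m (1 - v)^{-m-1}$ as the finite sum $\sum_k \binom{m}{k}\binom{m+n-k}{m} z^{m-k}$, which is our own rewriting of the "coefficient of $v^n$" prescription rather than a formula quoted from the text; the function names are likewise illustrative.

```python
from math import comb


def h_coefficient(n, m, z):
    """H(n, m, z): coefficient of v**n in (z + v)**m * (1 - v)**(-m - 1)."""
    return sum(comb(m, k) * comb(m + n - k, m) * z ** (m - k)
               for k in range(min(m, n) + 1))


def transition_probability(m, n, f, g):
    """P(mt|n0) from Eqs. (15)-(16); assumes f != g and 1 + g - f != 0."""
    z = g * (1 + g - f) / ((1 + g) * (f - g))
    prefactor = (1 + g - f) ** (n - m) * (f - g) ** m / (1 + g) ** (n + 1)
    return prefactor * h_coefficient(n, m, z)


# Illustrative values of f and g (not taken from the text).
f, g, n = 0.6, 0.3, 3
total = sum(transition_probability(m, n, f, g) for m in range(80))
print("sum over final states:", total)
```

For any admissible f and g the probabilities should sum to unity, which provides a simple check on the prefactor of Eq. (15).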


A. Characteristic function 

Using the abbreviation $w = e^{i\alpha}$, we can write the characteristic function in 
the form 

$$\langle e^{i\alpha a^\dagger a} \rangle = \sum_m w^m P(mt|n0). \tag{17}$$

Using Eq. (15) and performing the sum, we obtain as our characteristic 
function 

$$\langle e^{i\alpha a^\dagger a} \rangle = \langle e^{i\alpha m} \rangle = \frac{[1 + (1 - w)(g - f)]^n}{[1 + (1 - w)g]^{n+1}}. \tag{18}$$

From this characteristic function we immediately obtain the first and second 
moments 

$$\langle m \rangle = -i\, \partial \langle e^{i\alpha m} \rangle / \partial\alpha\, |_{\alpha = 0} = f(t)n + g \tag{19}$$

$$\langle m^2 \rangle - \langle m \rangle^2 = (fn + g)(1 + fn + g) - f^2 n(1 + n). \tag{20}$$

By comparing Eqs. (17) and (18) we obtain a new expression for the transition 
probability: 

$$P(mt|n0) = \text{coefficient of } w^m \text{ in } [1 + (1 - w)(g - f)]^n / [1 + (1 - w)g]^{n+1}. \tag{21}$$
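
The coefficient in Eq. (21) can be extracted by multiplying the binomial expansion of the numerator by the negative-binomial series of the denominator. The following sketch does this and compares the first two moments with Eqs. (19) and (20); the values of f, g and n are arbitrary illustrations, and the function name is ours.

```python
from math import comb


def transition_probability(m, n, f, g):
    """Coefficient of w**m in [1+(1-w)(g-f)]**n / [1+(1-w)g]**(n+1), Eq. (21)."""
    a, b, c = 1 + g - f, g - f, 1 + g
    total = 0.0
    for k in range(min(m, n) + 1):
        # w**k term of the numerator (a - b*w)**n.
        numerator = comb(n, k) * a ** (n - k) * (-b) ** k
        # w**(m-k) term of (c - g*w)**(-(n+1)).
        denominator = comb(n + m - k, n) * g ** (m - k) / c ** (n + 1 + m - k)
        total += numerator * denominator
    return total


f, g, n = 0.6, 0.3, 3
probs = [transition_probability(m, n, f, g) for m in range(200)]
mean = sum(m * p for m, p in enumerate(probs))
variance = sum(m * m * p for m, p in enumerate(probs)) - mean ** 2
print("mean:", mean, "expected:", f * n + g)
print("variance:", variance,
      "expected:", (f * n + g) * (1 + f * n + g) - f * f * n * (1 + n))
```

The truncation at 200 final states is harmless here because the terms fall off geometrically with ratio g/(1 + g).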
B. Special cases 

Using Eqs. (16) and (21) we shall write down a number of special cases 
for the transition probability: 

$$P(mt|00) = g^m / (1 + g)^{m+1} \tag{22}$$

$$P(mt|10) = [mf + g(1 + g - f)]\, g^{m-1} / (1 + g)^{m+2} \tag{23}$$

$$P(0t|n0) = (1 + g - f)^n / (1 + g)^{n+1} \tag{24}$$

$$P(1t|n0) = [g(1 + g - f) + nf](1 + g - f)^{n-1} / (1 + g)^{n+2} \tag{25}$$

$$P(0t|00) = (1 + g)^{-1} \tag{26}$$

$$P(1t|00) = g / (1 + g)^2 \tag{27}$$

$$P(0t|10) = (1 + g - f) / (1 + g)^2 \tag{28}$$

$$P(1t|10) = [g(1 + g - f) + f] / (1 + g)^3. \tag{29}$$

We shall see later that when the reservoir temperature is at absolute zero 
the function g can be set equal to zero. In this case the transition probability 
vanishes for m > n and reduces for m ≤ n to 

$$P(mt|n0) = f^m (1 - f)^{n-m}\, \frac{n!}{m!\,(n - m)!} \quad \text{if } g = 0. \tag{30}$$
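
A quick consistency check on the closed forms is that each distribution over the final state m is normalized. The sketch below codes the cases of an initial ground state, an initial first excited state (with the denominator exponent m + 2, which is what normalization requires), and the zero-temperature binomial form; the function names and numerical values are illustrative assumptions.

```python
from math import comb


def p_from_ground(m, g):
    """P(mt|00) = g**m / (1 + g)**(m + 1)."""
    return g ** m / (1 + g) ** (m + 1)


def p_from_first_excited(m, f, g):
    """P(mt|10) = [m f + g (1 + g - f)] g**(m - 1) / (1 + g)**(m + 2), for g > 0."""
    return (m * f + g * (1 + g - f)) * g ** (m - 1) / (1 + g) ** (m + 2)


def p_zero_temperature(m, n, f):
    """Binomial form valid when g = 0; vanishes for m > n."""
    if m > n:
        return 0.0
    return comb(n, m) * f ** m * (1 - f) ** (n - m)


f, g = 0.6, 0.3
print(sum(p_from_ground(m, g) for m in range(200)))
print(sum(p_from_first_excited(m, f, g) for m in range(200)))
print(sum(p_zero_temperature(m, 5, 0.4) for m in range(6)))
```

With the exponent m + 1 in place of m + 2 the second sum would come out as 1 + g rather than unity.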


IV. Approach to Equilibrium 

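In the Markoffian limit we have shown in Lax (QIV, 3.8) that the mean 
excitation of the harmonic oscillator obeys 

$$d\langle a^\dagger a \rangle / dt = \gamma \bar{n} - \gamma \langle a^\dagger a \rangle, \tag{31}$$

which has the solution 

$$\langle a^\dagger(t)\, a(t) \rangle = e^{-\gamma t} \langle a^\dagger(0)\, a(0) \rangle + \bar{n}(1 - e^{-\gamma t}), \tag{32}$$

so that 

$$f(t) = e^{-\gamma t}; \qquad g(t) = \bar{n}(1 - e^{-\gamma t}). \tag{33}$$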
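
As a small illustration of the Markoffian solution, the sketch below evaluates f(t) and g(t) and confirms by a central finite difference that the resulting mean excitation satisfies the rate equation (31). The numerical values of $\gamma$, $\bar{n}$ and the initial excitation are arbitrary assumptions.

```python
import math


def f_markov(t, gamma):
    """f(t) = exp(-gamma t)."""
    return math.exp(-gamma * t)


def g_markov(t, gamma, n_bar):
    """g(t) = n_bar (1 - exp(-gamma t))."""
    return n_bar * (1.0 - math.exp(-gamma * t))


def mean_excitation(t, n0, gamma, n_bar):
    """Mean excitation f(t) n0 + g(t) for an initial excitation n0."""
    return f_markov(t, gamma) * n0 + g_markov(t, gamma, n_bar)


# Finite-difference check of the rate equation at one instant.
gamma, n_bar, n0, t, h = 0.5, 2.0, 7.0, 1.3, 1e-6
lhs = (mean_excitation(t + h, n0, gamma, n_bar)
       - mean_excitation(t - h, n0, gamma, n_bar)) / (2 * h)
rhs = gamma * n_bar - gamma * mean_excitation(t, n0, gamma, n_bar)
print("d<n>/dt:", lhs, " gamma*n_bar - gamma*<n>:", rhs)
```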

These results have also been found by Louisell and Walker after the use of a 
Weisskopf-Wigner approximation. We note the particular result 

$$f(\infty) = 0, \tag{34}$$

which tells us that the mean excitation of the oscillator eventually forgets 
what the initial excitation was. We expect therefore that $f(\infty) = 0$ is the 
general condition for approach to equilibrium. We note that in this limit 

$$z = g(1 + g - f)/[(1 + g)(f - g)] \to -1 \tag{35}$$

so that 

$$H(n, m, -1) = (-1)^m, \tag{36}$$

from which we obtain the transition probability at infinite time in the form 

$$P(m\infty|n0) = [g(\infty)]^m / [1 + g(\infty)]^{m+1}, \tag{37}$$

a result which is independent of the starting state n. Equation (37) describes 
the usual “Gaussian” equilibrium state appropriate to a mean quantum 
number 

$$\bar{n} = g(\infty), \tag{38}$$

independent of the initial excitation. 

V. Equations of Motion for the Second Moments 

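For simplicity of notation we treat the system oscillator as oscillator zero 
among the complete set of oscillators and thus write a Hamiltonian in the 
form of 

$$\mathcal{H} = \sum_{r,s} b_r^\dagger b_s \mathcal{H}_{sr}; \qquad b_0 = a, \quad b_0^\dagger = a^\dagger \tag{39}$$

$$\mathcal{H}_{00} = \omega_0, \qquad \mathcal{H}_{j0} = \kappa_j, \qquad \mathcal{H}_{0j} = \kappa_j^* \tag{40}$$

$$\mathcal{H}_{ij} = \omega_i \delta_{ij}, \qquad i, j \neq 0. \tag{41}$$

The Heisenberg equation of motion for the set of second moments is then 
given in the form 

$$i\, d(b_i^\dagger b_j)/dt = \sum_s [b_i^\dagger b_s \mathcal{H}_{sj} - \mathcal{H}_{is} b_s^\dagger b_j] \tag{42}$$

or 

$$i\, dy_{ij}/dt = [y, \mathcal{H}]_{ij} \quad \text{all } i, j, \tag{43}$$

where 

$$y_{ij}(t) \equiv \langle b_i^\dagger(t)\, b_j(t) \rangle; \qquad y_{00}(t) = \langle a^\dagger(t)\, a(t) \rangle = n(t), \tag{44}$$

$$y_{ij}(0) = \langle b_i^\dagger(0)\, b_j(0) \rangle = \delta_{ij} n_i. \tag{45}$$

The $n_i$ represent the initial excitations of the oscillators i. We also obtain the 
second-moment equations 

$$i\, d(b_i b_j)/dt = \sum_s [b_i b_s \mathcal{H}_{sj} + b_s b_j \mathcal{H}_{si}], \tag{46}$$

from which we can conclude 

$$\langle b_i(t)\, b_j(t) \rangle = 0 \quad \text{if} \quad \langle b_i(0)\, b_j(0) \rangle = 0. \tag{47}$$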

Thus we need only concern ourselves with the set of equations (43). We can 
treat these equations by making the Laplace transformation 

$$Y_{ij}(p) = \int_0^\infty e^{-pt}\, y_{ij}(t)\, dt. \tag{48}$$

This leads us to the set of algebraic equations 

$$p\, Y_{00}(p) = n_0 + i \sum_{j \neq 0} [\kappa_j^* Y_{j0}(p) - Y_{0j}(p) \kappa_j] \tag{49}$$

$$(i\omega_{0j} + p)\, Y_{j0}(p) = i\kappa_j Y_{00}(p) - i \sum_{k \neq 0} Y_{jk}(p) \kappa_k \tag{50}$$

$$(i\omega_{ji} + p)\, Y_{ij}(p) = i[\kappa_i Y_{0j}(p) - Y_{i0}(p) \kappa_j^*] + n_i \delta_{ij}. \tag{51}$$

We then introduce the definition 

$$i \sum_{j \neq 0} [\kappa_j^* Y_{j0} - Y_{0j} \kappa_j] \equiv -S(p)\, Y_{00}(p) + R(p), \tag{52}$$

from which we can solve for the motion of our special oscillator in the form 

$$Y_{00}(p) = \frac{n_0}{p + S(p)} + \frac{R(p)}{p + S(p)}, \tag{53}$$

which leads to a time-dependent excitation for that oscillator in the form 

$$n(t) = f(t)\, n_0 + g(t), \tag{54}$$

where the time-dependent functions f(t) and g(t) are given by the inverse 
Laplace transforms 

$$f(t) = \frac{1}{2\pi i} \int \frac{e^{pt}\, dp}{p + S(p)} \equiv \frac{1}{2\pi i} \int e^{pt} F(p)\, dp \tag{55}$$

$$g(t) = \frac{1}{2\pi i} \int \frac{e^{pt} R(p)\, dp}{p + S(p)}. \tag{56}$$


In investigating whether or not equilibrium is approached, we shall be primarily 
concerned with the limiting values (that omit oscillatory terms): 

$$\lim_{p \to 0+} pF(p) = \lim_{p \to 0+} p \int_0^\infty e^{-pt} f(t)\, dt = f(\infty) \tag{57}$$

$$f(\infty) = \lim_{p \to 0+} \frac{p}{p + S(p)} \tag{58}$$

$$g(\infty) = \lim_{p \to 0+} \frac{p R(p)}{p + S(p)}. \tag{59}$$

In order to obtain S(p) and R(p), we must solve Eqs. (50) and (51). To obtain 
S we solve these equations first setting $n_i = 0$. To obtain R we must solve 
these equations retaining the $n_i$ but setting $Y_{00} = 0$. Since the equations of 
motion are linear the results are necessarily expressible in the form of Eq. (52). 


VI. Lowest Order Perturbation Theory Results 


We can obtain a qualitative understanding of the nature of our results by
solving for them in the lowest order of perturbation theory. To obtain $S(p)$
we set $Y_{jk} = 0$ in Eq. (50) and iterate on Eqs. (50) and (51) to obtain our
result in the form of

$$ S(p) = \sum_j \frac{|K_j|^2\, 2p}{(\omega_{0j})^2 + p^2} = 2p \int \frac{\rho(\omega)\, d\omega}{(\omega_0 - \omega)^2 + p^2}, \qquad (60) $$

where the density function $\rho(\omega)$ is defined by

$$ \rho(\omega) = \sum_j |K_j|^2\, \delta(\omega - \omega_j). \qquad (61) $$

In the limit, as the number of reservoir oscillators approaches infinity, this
density function approaches a continuous function. To obtain $R(p)$, we set
$Y_{00} = 0$ and start with

$$ \gamma_{ij}(t) \approx \gamma_{ij}(0) = n_i \delta_{ij}; \qquad Y_{ij}(p) = (n_i/p)\, \delta_{ij}. \qquad (62) $$

One iteration then leads to

$$ R(p) = 2 \sum_j \frac{|K_j|^2\, n_j}{(\omega_{0j})^2 + p^2} = 2 \int \frac{\rho(\omega)\, n(\omega)\, d\omega}{(\omega_0 - \omega)^2 + p^2}, \qquad (63) $$

where we have set $n_j = n(\omega_j)$, a continuous function of $\omega$.


A. Case 1. $\rho(\omega_0) \neq 0$

In this case we can take the limit $p \to 0+$ to obtain the asymptotic results

$$ S(0+) = 2\pi \rho(\omega_0) \equiv \gamma \qquad (64) $$

$$ \lim_{p \to 0+} p R(p) = 2\pi\, n(\omega_0)\, \rho(\omega_0) \qquad (65) $$

$$ f(\infty) = 0 \qquad (66) $$

$$ g(\infty) = n(\omega_0) \equiv n. \qquad (67) $$

We see from Eq. (66) that if the density of states does not vanish at the
frequency $\omega_0$ of our special oscillator, equilibrium is approached. We see from
Eq. (67) that the final excitation number of our system oscillator is determined
by the excitation number of the reservoir oscillators at the frequency of the
system oscillator.

If we assume that the behavior of our $S(p)$ and $R(p)$ functions can be
extended from small $p$ to all $p$ in the form

$$ S(p) \approx \gamma; \qquad R(p) \approx \gamma n / p, \qquad (68) $$

then we immediately obtain results in agreement with the Markoffian
procedure of Eqs. (31-34).
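The limit in Eq. (64) can be checked numerically. The following is only a sketch under an assumption that is not in the text: the density is taken to be flat, $\rho(\omega) = \rho$ on a band $[a, b]$, so that the integral in Eq. (60) has a closed form. The function name `S_flat` and all parameter values are illustrative.

```python
import math

def S_flat(p, omega0, a, b, rho):
    """Eq. (60) for a flat density rho on [a, b] (an illustrative assumption).

    The antiderivative of p / ((omega0 - w)**2 + p**2) with respect to w
    is atan((w - omega0) / p), so the integral is a difference of arctangents.
    """
    return 2.0 * rho * (math.atan((b - omega0) / p) - math.atan((a - omega0) / p))

p = 1e-4
rho = 0.05

# omega0 inside the band: S(0+) approaches 2*pi*rho, as in Eq. (64).
S_inside = S_flat(p, omega0=1.0, a=0.0, b=2.0, rho=rho)

# omega0 outside the band: S(p) vanishes linearly in p, so S(p)/p stays finite.
S_over_p_outside = S_flat(p, omega0=3.0, a=0.0, b=2.0, rho=rho) / p

print(S_inside, 2.0 * math.pi * rho)
print(S_over_p_outside)
```

With the oscillator frequency inside the band the damping constant is finite; outside the band $S(p)/p$ tends to $2\int \rho\, d\omega/(\omega_0 - \omega)^2$, which is the situation of Case 2.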

B. Case 2. $\rho(\omega_0) = 0$

In this case we can also obtain the limiting values

$$ \lim_{p \to 0+} [S(p)/p] = 2 \int \frac{\rho(\omega)\, d\omega}{(\omega_0 - \omega)^2} \qquad (69) $$

$$ R(0+) = 2 \int \frac{\rho(\omega)\, n(\omega)\, d\omega}{(\omega_0 - \omega)^2} \qquad (70) $$

$$ f(\infty) = \left[ 1 + 2 \int \frac{\rho(\omega)\, d\omega}{(\omega_0 - \omega)^2} \right]^{-1} > 0 \qquad (71) $$

$$ g(\infty) = R(0+)\, f(\infty) > 0. \qquad (72) $$

We see from Eq. (71) that when the density of states $\rho(\omega_0)$ vanishes at our
oscillator frequency $\omega_0$ the system never approaches equilibrium. This is
basically a consequence of the function $S(p)$ vanishing linearly in the
neighborhood of $p = 0$.

In this section, we have shown that, to lowest order perturbation theory,
equilibrium is attained if $\rho(\omega_0) > 0$, and is not attained if $\rho(\omega_0) = 0$. We
shall show in the next section that an exact calculation replaces these by the
corresponding conditions $\rho(\Omega_0) > 0$ and $\rho(\Omega_0) = 0$, where $\Omega_0$ is the exact
perturbed frequency of the oscillator. These conditions are closely related:
if $\omega_0$ is outside the continuous spectrum, we would expect a turn-on of the
coupling between system oscillator and bath to introduce repulsions that
would prevent $\Omega_0$ from entering the continuum. If $\rho(\omega_0) > 0$, however,
sufficiently strong coupling may push $\Omega_0$ out.


VII. Exact Expressions for the Mean Oscillator Excitation 

Perhaps the simplest way to obtain exact expressions for the mean motion
of our system oscillator is to transform to the new normal coordinates $B_J$ and
then transform back again. These normal coordinates can be assumed to
obey

$$ B_J(t) = B_J(0) \exp(-i\Omega_J t), \qquad (73) $$

where $B_J \approx b_j$ and the frequency $\Omega_J \approx \omega_j$. The transformation to the new
variables is given as a unitary transformation

$$ b_i = \sum_J U_{iJ} B_J. \qquad (74) $$

The inverse transformation can therefore be written as

$$ B_J = \sum_i (U^{-1})_{Ji}\, b_i = \sum_i U_{iJ}^*\, b_i. \qquad (75) $$

The mean excitation of our system oscillator can therefore be given in the form

$$ \langle b_0^\dagger(t)\, b_0(t) \rangle = \sum_{I,J} U_{0I}^* U_{0J} \exp[i(\Omega_I - \Omega_J)t]\, \langle B_I^\dagger B_J \rangle $$
$$ \qquad = \sum_{I,J} U_{0I}^* U_{0J} \exp[i(\Omega_I - \Omega_J)t] \sum_{r,s} U_{rI} U_{sJ}^* \langle b_r^\dagger b_s \rangle, \qquad (76) $$

into which we can insert the initial condition

$$ \langle b_r^\dagger b_s \rangle = n_r \delta_{rs}. \qquad (77) $$

Thus we obtain the exact expressions

$$ f(t) = \sum_{I,J} |U_{0I}|^2 |U_{0J}|^2 \exp[i(\Omega_I - \Omega_J)t] \qquad (78) $$

$$ g(t) = \sum_{I,J} U_{0I}^* U_{0J} \exp[i(\Omega_I - \Omega_J)t] \sum_{r \neq 0} U_{rI} U_{rJ}^*\, n_r. \qquad (79) $$

If $n_r = n$ is independent of $r$, we obtain a simple relation between $g$ and $f$:

$$ g(t) = n[1 - f(t)]. \qquad (80) $$

The Laplace transform of $f(t)$ is given by

$$ F(p) = \sum_{I,J} |U_{0I}|^2 |U_{0J}|^2\, \frac{p}{p^2 + (\Omega_I - \Omega_J)^2}. \qquad (81) $$

Our limiting value then takes the form

$$ f(\infty) = \lim_{p \to 0} p F(p) = \sum_I |U_{0I}|^4. \qquad (82) $$

To get a limiting behavior independent of $N$ as $N \to \infty$ we must have

$$ \int \rho(\omega)\, d\omega = \sum_j |K_j|^2 = \text{independent of } N, \qquad (83) $$

in other words

$$ K_j = O(N^{-1/2}) \qquad (84) $$

and

$$ U_{0I} = O(N^{-1/2}); \qquad I \neq 0. \qquad (85) $$

Thus in the limit as the number of reservoir oscillators goes to infinity we
obtain

$$ \lim_{N \to \infty} f(\infty) = \lim_{N \to \infty} |U_{00}|^4. \qquad (86) $$

To obtain the normal mode transformation, we take all of the linear
equations of motion for the destruction operators and assume that all
operators have the same time dependence $\exp(-i\Omega t)$. From this we learn
that the relative amplitudes must obey

$$ b_i = [K_i^*/(\Omega - \omega_i)]\, a. \qquad (87) $$

For mode $J$, we must replace $\Omega$ by $\Omega_J$. We are primarily interested in the mode
0, which is that mode that resembles most closely our system oscillator. With
this mode the elements of our unitary transformation must obey

$$ U_{i0} = [K_i^*/(\Omega_0 - \omega_i)]\, U_{00}. \qquad (88) $$

The amplitude $U_{00}$ is determined by normalization to take the value

$$ U_{00} = \left[ 1 + \sum_i \frac{|K_i|^2}{(\Omega_0 - \omega_i)^2} \right]^{-1/2}. \qquad (89) $$

Making use of Eqs. (61) and (86) we can therefore write our asymptotic
value for $f$ in the form

$$ f(\infty) = \left[ 1 + \int \frac{\rho(\omega)\, d\omega}{(\Omega_0 - \omega)^2} \right]^{-2}. \qquad (90) $$

As long as the perturbed frequency $\Omega_0$ is outside the range of the spectrum
of the reservoir oscillators the integral in Eq. (90) converges and $f(\infty)$ does
not equal zero. Note, moreover, that in the weak coupling limit the expression
(90) reduces to our lowest order result, Eq. (71).
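The weak-coupling statement can be made explicit with a first-order expansion. This is only a sketch: the abbreviation $x$ is introduced here for convenience and is not in the original, and it is assumed that $\Omega_0 \to \omega_0$ as the coupling vanishes.

```latex
x \equiv \int \frac{\rho(\omega)\, d\omega}{(\Omega_0 - \omega)^2}, \qquad
(1 + x)^{-2} = 1 - 2x + O(x^2), \qquad
(1 + 2x)^{-1} = 1 - 2x + O(x^2).
```

To first order in $x$, and with $\Omega_0$ replaced by $\omega_0$ in the integral, the exact form of Eq. (90) and the lowest-order form of Eq. (71) therefore coincide.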

The equation obeyed by the special oscillator variable $a$ leads to the
relation

$$ \Omega - \omega_0 = \sum_i \frac{|K_i|^2}{\Omega - \omega_i} \to \int \frac{\rho(\omega)\, d\omega}{\Omega - \omega} \qquad (91) $$

that determines all of the normal frequencies of the perturbed system. The
root $\Omega_0$ is the root closest to the value $\omega_0$ of the unperturbed isolated system
mode. In a similar way we obtain the exact limiting expression

$$ g(\infty) = \sum_I |U_{0I}|^2 \sum_{r \neq 0} |U_{rI}|^2\, n_r. \qquad (92) $$

If we introduce

$$ n_I = \sum_{r \neq 0} |U_{rI}|^2\, n_r \Big/ \sum_{r \neq 0} |U_{rI}|^2 \qquad (93) $$

as an appropriate average of reservoir excitations, our limiting value takes
the form of

$$ g(\infty) = \langle n \rangle - n_0 f(\infty), \qquad (94) $$

where $\langle n \rangle$ is defined by




(95)


Equation (86) assumes that only one mode exists outside the continuous
spectrum. If there are several, with frequencies $\Omega_L$, Eq. (90) is replaced by



(96)


so that, a fortiori, $f(\infty) > 0$, and equilibrium is not reached when one or
more isolated (perturbed) modes exist.

We interpret the failure of the system to approach equilibrium as inability
of the system to perform the multiphonon transitions that are needed when the
system frequency is outside that of the continuum. Equation (86) essentially
states that the system energy is given to the perturbed mode which resembles
most closely the system mode. A subsequent measurement of the system
energy simply measures the extent to which the perturbed mode makes its
energy available to the unperturbed system mode. The reduction of $f(\infty)$
from unity is thus a measure of this overlap effect and not evidence for the
existence of any transition processes.

Conversely, if all perturbed frequencies $\Omega_J$ are inside the continuous
spectrum, Eq. (85) is valid for all $J$, and it follows from Eq. (82) that $f(\infty) \to 0$ as
$N \to \infty$. Alternatively, if no isolated $\Omega_L$ exists, there are no terms in Eq. (96).
Thus if no isolated modes are created, $f(\infty) = 0$ and equilibrium is approached.

An interesting borderline case can occur if $\rho(\omega_0) \neq 0$ but $\rho(\Omega_J) = 0$ for
some $J$. In this case $f(\infty) > 0$ and complete equilibrium is not approached.
But if the mode $J$ is very unlike the unperturbed system mode, $f(\infty)$ will be
quite small, and a near equilibrium may result.
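The contrast between an isolated mode and a mode inside the continuum can be illustrated by direct diagonalization of a finite model. The sketch below is not the authors' calculation: it assumes a Hermitian one-excitation coupling matrix with diagonal entries $\omega_0, \omega_j$ and off-diagonal entries $K_j$, $K_j^*$, an evenly spaced bath on $[0, 2]$, and $K_j = 0.3/\sqrt{N}$ in the spirit of Eq. (84). All function names and parameter values are hypothetical.

```python
import numpy as np

def normal_mode_weights(omega0, omegas, K):
    """Return (Omega, w) with w[I] = |U_0I|**2 for the assumed coupling matrix."""
    n = len(omegas) + 1
    H = np.zeros((n, n), dtype=complex)
    H[0, 0] = omega0                      # system oscillator
    idx = np.arange(1, n)
    H[idx, idx] = omegas                  # reservoir oscillators
    H[1:, 0] = K                          # system-reservoir coupling
    H[0, 1:] = np.conj(K)
    Omega, U = np.linalg.eigh(H)          # columns of U are the normal modes
    return Omega, np.abs(U[0, :]) ** 2

def f_infinity(w):
    """Long-time value of f, Eq. (82): sum over modes of |U_0I|**4."""
    return float(np.sum(w ** 2))

def f_of_t(Omega, w, t):
    """Eq. (78), written as the squared modulus of a single sum."""
    return float(np.abs(np.sum(w * np.exp(-1j * Omega * t))) ** 2)

N = 200
omegas = np.linspace(0.0, 2.0, N)
K = np.full(N, 0.3 / np.sqrt(N))

Omega_in, w_in = normal_mode_weights(1.0, omegas, K)    # omega0 inside the band
Omega_out, w_out = normal_mode_weights(3.0, omegas, K)  # omega0 outside the band

f_in = f_infinity(w_in)
f_out = f_infinity(w_out)
print(f_in, f_out, w_out.max() ** 2)
```

In this model `f_in` is small and shrinks further as `N` grows, whereas `f_out` stays close to `w_out.max() ** 2`, the $|U_{00}|^4$ of Eq. (86), because one perturbed mode carries almost all of the system oscillator.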

Note added in Proof. After the submission of this paper, Professor N. G.
Van Kampen kindly informed us that this problem has also been treated in a
doctoral thesis by Pieter Ullersma of the University of Zurich.

I. Introduction

In the quantum theory of matter, there are two fundamental ideas which
are closely associated with discoveries by Professor John C. Slater, namely the
Slater determinants and the Hartree-Fock method. Our physical understanding
of the structure of atoms, molecules, solids, and atomic nuclei is essentially
based on the independent-particle model (1), and these are the fundamental
mathematical and conceptual tools for dealing with this model. The
self-consistent-field (SCF) scheme was first developed intuitively by Hartree (2)
for atoms. In a study of complex atomic spectra, Slater (3) introduced the now
famous determinants which carry his name, and in a subsequent paper (4),
he showed that Hartree's method could be justified and extended by
treating Schrödinger's many-electron wave equation by the variation principle
for a determinantal wave function. Together with a paper published about
simultaneously and independently by Fock (5), this forms the basis for the
so-called Hartree-Fock method. Few discoveries have been of such importance
for the application of quantum mechanics to the study of the properties of
matter.

Slater's interest in the independent-particle model has continued through
the years, and, in a series of papers, he has studied in particular the
phenomena of "exchange" and "correlation" (see references). In treating the
symmetry properties of various systems, Slater also noted some peculiarities
in the Hartree-Fock method that form the starting point of this paper, which
is dedicated to him.

II. Symmetry Dilemma in the Hartree-Fock Method 

In Hartree's original calculations of the electronic structure of atoms (2), it
was always assumed that the electronic orbitals would be symmetry-adapted
and of s, p, d, f, etc., character, and that, in the calculations, the
self-consistent-field potentials should be replaced by their spherically symmetric part.
This scheme was essentially refined when Slater and Fock suggested that the
total wave function $\Psi$ should be approximated by a single Slater determinant
$D$ built up from one-electron functions or spin-orbitals and that the best
result would be obtained by applying the variation principle $\delta\langle D|\mathcal{H}|D\rangle = 0$,
subject to the constraint $\langle D|D\rangle = 1$, to the many-electron Hamiltonian $\mathcal{H}$
associated with the system.

It seems to have been generally assumed that, if the system has a certain
symmetry property explicitly manifested in the Hamiltonian $\mathcal{H}$, the solutions
to the Hartree-Fock equations associated with the variation principle would
automatically be symmetry-adapted. It was proven by Delbrück (6) that, if the
total system is spherically symmetrical and one requires that the total
determinant has $^1S$ character, then the associated Hartree-Fock functions are
eigenfunctions of the orbital angular momentum and of the spin. In the cases
of more general types of symmetry occurring in molecular and solid-state
theory, it has been shown by Roothaan (7) and the author (8) that the
assumption that the solutions to the Hartree-Fock equations are
symmetry-adapted, i.e., form a basis for an irreducible representation, is always
self-consistent and corresponds to a specific extreme value of the total energy.
The only question is whether this extreme value is associated with the
absolute minimum of the energy or not.

The question of the character of the extreme values of $\langle\mathcal{H}\rangle$ has been
studied by Thouless (9) and Adams (10). These values may be maxima,
minima, or terrace points, and it is somewhat confusing that Adams uses the
term "absolute minimum" for every point where the second variation of the
total energy $\langle\mathcal{H}\rangle$ is positive definite, whereas one usually reserves this term
for the lowest of all possible minima. In an interesting study of the He atom,
in this volume, Coulson has emphasized the saddle-point character of $\langle\mathcal{H}\rangle$
with respect to the orbital splitting of the closed shell $(1s^2)$ to the form $(1s', 1s'')$
introduced by Hylleraas (11). So far, no one has found any simple criterion
for the absolute minimum or succeeded in studying its properties. In
connection with the international symposia in Boulder (1958) and in Tokyo (1962),
several authors expressed the opinion that it seemed extremely plausible that

The Projected Hartree-Fock Method 




the absolute minimum of the total energy $\langle D|\mathcal{H}|D\rangle$ found by solving the
Hartree-Fock equations should correspond to a set of orbitals which are
necessarily symmetry-adapted, but no proof was given. At the Hylleraas
symposium on Sanibel Island (1963), the author (12) tried to draw attention
to the problem again, particularly to the aspects reviewed in this section.

The problems may be somewhat simplified by studying the Slater
determinant $D$ itself, instead of the set of individual spin-orbitals involved. Some
confusion may arise from the fact that the exact eigenfunction $\Psi$ and the
approximate eigenfunction $D$ may have rather different properties. For
instance, if $\Lambda$ is a normal constant of motion satisfying the relations

$$ \mathcal{H}\Lambda = \Lambda\mathcal{H}, \qquad \Lambda\Lambda^\dagger = \Lambda^\dagger\Lambda, \qquad (1) $$

then every eigenfunction $\Psi$ to $\mathcal{H}$ is automatically an eigenfunction to $\Lambda$
or (in the case of a degenerate energy level) may be chosen in that way, so that

$$ \mathcal{H}\Psi = E\Psi, \qquad \Lambda\Psi = \lambda\Psi. \qquad (2) $$

For the exact eigenfunction, the second eigenvalue relation is hence simply
a consequence of the first. On the other hand, for the approximate wave
function $D$, one replaces in the Hartree-Fock scheme the first eigenvalue
relation by the variation principle $\delta\langle D|\mathcal{H}|D\rangle = 0$, subject to the extra
condition $\langle D|D\rangle = 1$. So far, no one has proven in general that, out of this
principle, there follows the second equation $\Lambda D = \lambda D$, and this relation
should then be considered as a constraint which necessarily raises the energy
above the absolute minimum. The argument is here given for a single constant
of motion $\Lambda$ but will later be extended also to groups $G = \{g\}$.

The first one to notice that the Hartree-Fock scheme and the symmetry
requirements were not automatically compatible was probably Slater (13) in
his fundamental study of the connection between the VB- and MO-methods,
as exemplified by the applications to the hydrogen molecule, in his classical
paper about cohesion in monovalent metals in 1930, and this fact has later
been more explicitly stated in several of his papers (14). In studying the energy
curves as functions of the internuclear separation $R$, he found that, for
sufficiently separated atoms $a$ and $b$, the single determinant $(a\alpha, b\beta)$ has a lower
energy than the corresponding symmetry-adapted Hartree-Fock solution of
type $(\sigma_g)^2$, depending on the simple fact that the latter has a wrong
asymptotic behavior leading to ionized states with high energy for $R \to \infty$.

Another example of a similar type but referring to the equilibrium distance
itself is provided by the benzene molecule having the symmetry $D_{6h}$, where
recent calculations by Pauncz et al. (15) indicate that there are single
determinants associated with the symmetry $D_{3h}$ which have a considerably lower
energy than the corresponding determinants of symmetry $D_{6h}$. Even if the
latter are not exact solutions to the Hartree-Fock equations, the
approximations seem good enough to indicate that the results have definite
significance.

It has further been pointed out by Slater (14) that, in systems with
unbalanced spins in which the spin does not explicitly enter the Hamiltonian,
the electrons with plus spins will be influenced by another exchange potential
than the electrons with minus spin, since exchange interactions occur only
between electrons having parallel spins. One could therefore expect that
electrons with different spins would have different orbitals due to this
exchange polarization. A simple example of such a system is the $^2S$ ground
state of the lithium atom, for which the conventional form $(1s\alpha, 1s\beta, 2s\alpha)$
due to this effect is changed over to $(1s'\alpha, 1s''\beta, 2s\alpha)$. The exchange
polarization is of essential importance in treating magnetic phenomena (16), but here
we will only stress the fact that the solution of the Hartree-Fock equations,
i.e., the variation principle, for open shells leads to approximate wave
functions which are no longer exact eigenfunctions to the total spin $S^2$.

It is evident from these examples that the basic symmetry properties of
approximate wave functions do not automatically follow from the variation
principle and that a great deal of attention should be devoted to this problem.
In this situation, it may be worthwhile to distinguish between the various
types of self-consistent-field schemes more clearly and to emphasize the
definitions. In the conventional Hartree-Fock scheme, one apparently starts
out from two basic equations:

$$ \delta\langle D|\mathcal{H}|D\rangle = 0, \qquad \Lambda D = \lambda D, \qquad (3) $$

and the corresponding minimum could then be said to be $\Lambda$-adapted. It is
easily shown (5) that, if $\Lambda$ is a fundamental symmetric function of the
one-electron operators $\Lambda_1, \Lambda_2, \Lambda_3, \ldots$, then the spin-orbitals entering $D$ which
correspond to the energy minimum are automatically eigenfunctions to $\Lambda_1$
or can be chosen in that way.

On the other hand, if one drops the constraint $\Lambda D = \lambda D$ and considers only
the relation

$$ \delta\langle D|\mathcal{H}|D\rangle = 0, \qquad (4) $$

one obtains a nonrestricted Hartree-Fock scheme, and the solution $D$
corresponding to the absolute minimum has now usually lost its eigenvalue property
with respect to $\Lambda$, i.e., the corresponding Hartree-Fock functions are no
longer "symmetry-adapted." As a consequence of Slater's idea about
exchange polarization, many open-shell systems have now been investigated
by this approach (17).

It is clear that the Hartree-Fock scheme based on a single Slater
determinant is in a dilemma with respect to symmetry properties and other constants of
motion. If one looks for the absolute minimum of the energy, one loses the
symmetry properties, and, if one includes the symmetry properties, the
energy is increased considerably, often by as much as 1 eV per electron pair or
even more.

Some of the most striking examples of this symmetry dilemma may perhaps
be found in solid-state theory. In considering a system of free electrons in a
uniform positive background in a box, it has almost universally been assumed
that the plane waves would give the essential solution of the Hartree-Fock
equations. However, by studying, for example, a one-dimensional Fermi gas
with $\delta$-function repulsions, Overhauser (18) has shown that there exist
self-consistent solutions in the form of "giant spin waves" having a lower energy
than the plane-wave state. Such results look paradoxical only if one believes
that the second relation in Eq. (3) necessarily follows from the first. Some
examples involving negative atomic ions have also been given recently (19).

There are several ways out of the symmetry dilemma, but there is probably
only one way in which one can keep the contact with the
independent-particle model. The determinant $D$ corresponding to the absolute minimum
of $\langle D|\mathcal{H}|D\rangle$ subject to the condition $\langle D|D\rangle = 1$ has lost its fundamental
symmetry properties. It may be shown, however, that this determinant is now
a unique sum of components of various symmetry types, and that at least one
of the components has an even lower energy than $D$. Such a "component
analysis" is carried out by means of a set of projection operators $O$, and, in
order to proceed, we will now first study their properties.

III. Component Analysis with Respect to an Arbitrary 
Constant of Motion 

Let $\Lambda$ be an arbitrary normal constant of motion satisfying the relations (1).
Further, let the eigenvalues of $\Lambda$ be situated in a finite number of points
$\lambda_1, \lambda_2, \ldots, \lambda_n$ in the complex plane, each of which may be infinitely degenerate.
Such an operator always satisfies a reduced Cayley-Hamilton equation of the
type

$$ F(\Lambda) = \prod_{k=1}^{n} (\Lambda - \lambda_k) = 0. \qquad (5) $$

As an example, we may consider the exchange operator $P_{12}$ which satisfies
the relation $P_{12}^2 = 1$. It has the eigenvalues $\pm 1$, which are both infinitely
degenerate, and the reduced Cayley-Hamilton equation takes the form:

$$ (P_{12} + 1)(P_{12} - 1) = 0. \qquad (6) $$

Let us now introduce the reduced characteristic polynomial $F(z) =
\prod_k (z - \lambda_k)$ of degree $n$ in the complex variable $z$.
The associated function

$$ O_k(z) = \frac{F(z)}{(z - \lambda_k)\, F'(\lambda_k)} = \prod_{l \neq k} \frac{z - \lambda_l}{\lambda_k - \lambda_l} \qquad (7) $$

is a polynomial of degree $(n - 1)$ which has the value 1 for $z = \lambda_k$ and the
value 0 for $z = \lambda_l$ $(l \neq k)$; it has hence the character of a Lagrange
interpolation polynomial. It is interesting to study the function

$$ G(z) = 1 - \sum_{k=1}^{n} O_k(z), \qquad (8) $$

since it is a polynomial of degree $(n - 1)$ having $n$ different zero points:
$z = \lambda_1, \lambda_2, \ldots, \lambda_n$. The function $G(z)$ is hence identically vanishing for all
values of $z$, and one has the simple algebraic identity $G(z) = 0$, or

$$ 1 = \sum_{k=1}^{n} O_k(z). \qquad (9) $$

Of fundamental importance in the theory is the operator $O_k(\Lambda)$ obtained by
replacing the complex variable $z$ in Eq. (7) by the operator $\Lambda$:

$$ O_k(\Lambda) = \prod_{l \neq k} \frac{\Lambda - \lambda_l}{\lambda_k - \lambda_l}. \qquad (10) $$

According to the reduced Cayley-Hamilton equation (5), one has
$(\Lambda - \lambda_k) O_k = F(\Lambda)/F'(\lambda_k) = 0$, which gives

$$ \Lambda O_k = O_k \Lambda = \lambda_k O_k. \qquad (11) $$

This relation shows that $O_k$ is an eigenoperator to $\Lambda$ associated with the
eigenvalue $\lambda_k$. By using Eqs. (5) and (9), one obtains further

$$ O_k^2 = O_k, \qquad O_k O_l = 0 \;\; (k \neq l), \qquad 1 = \sum_{k=1}^{n} O_k, \qquad (12) $$

which means that the set of operators $O_1, O_2, \ldots, O_n$ are idempotent,
mutually exclusive, and form a resolution of the identity. For details of the
proof, we refer to some previous publications (20).
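The product form of Eq. (10) translates directly into a small numerical routine. The following is a minimal sketch, assuming a finite matrix stands in for the operator and that its distinct eigenvalues are supplied by the caller; the function name `lagrange_projectors` and the four-dimensional two-particle basis are illustrative choices, not part of the original text.

```python
import numpy as np

def lagrange_projectors(Lam, eigenvalues):
    """Build O_k = prod_{l != k} (Lam - lam_l) / (lam_k - lam_l), as in Eq. (10).

    Lam         : square matrix representing a normal operator
    eigenvalues : its distinct eigenvalues lam_1, ..., lam_n
    """
    dim = Lam.shape[0]
    identity = np.eye(dim, dtype=complex)
    projectors = []
    for k, lam_k in enumerate(eigenvalues):
        O = identity.copy()
        for l, lam_l in enumerate(eigenvalues):
            if l != k:
                O = O @ (Lam - lam_l * identity) / (lam_k - lam_l)
        projectors.append(O)
    return projectors

# Exchange operator P12 for two two-level particles in the basis
# |00>, |01>, |10>, |11>; it satisfies P12 @ P12 = 1, compare Eq. (6).
P12 = np.array([[1, 0, 0, 0],
                [0, 0, 1, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 1]], dtype=complex)

O_plus, O_minus = lagrange_projectors(P12, [1.0, -1.0])
print(np.real(O_plus))
print(np.real(O_minus))
```

For the exchange operator the routine reduces to the familiar symmetrizer $(1 + P_{12})/2$ and antisymmetrizer $(1 - P_{12})/2$, and the three properties of Eq. (12) can be checked by matrix multiplication.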

Let us now consider an arbitrary trial wave function $\Phi$ and let us investigate
whether it may be written as a sum of eigenfunctions to $\Lambda$. Introducing the
notation $\Phi_k = O_k \Phi$ and using the resolution of the identity in Eq. (12), one
obtains

$$ \Phi = 1 \cdot \Phi = \Big( \sum_k O_k \Big) \Phi = \sum_k O_k \Phi = \sum_k \Phi_k \qquad (13) $$

and further

$$ \Lambda \Phi_k = \Lambda O_k \Phi = \lambda_k O_k \Phi = \lambda_k \Phi_k, \qquad (14) $$

which shows that such a "component analysis" exists. Using the properties
(12) of the projection operators, it is also easily shown that the component
analysis is unique.

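Using the normality of the operator $\Lambda$, i.e., $\Lambda\Lambda^\dagger = \Lambda^\dagger\Lambda$, one can prove
that all the operators $O_k$ are self-adjoint:

$$ O_k^\dagger = O_k. \qquad (15) $$

In combination with Eqs. (1) and (12), this gives

$$ O_k^\dagger O_l = \delta_{kl}\, O_k, \qquad O_k^\dagger \mathcal{H} O_l = 0 \;\; (k \neq l). \qquad (16) $$

These relations imply that the functions $\Phi_k$ in the component analysis (13) are
not only orthogonal but also noninteracting with respect to $\mathcal{H}$:

$$ \langle \Phi_k | \Phi_l \rangle = \langle \Phi | O_k^\dagger O_l | \Phi \rangle = 0, $$
$$ \langle \Phi_k | \mathcal{H} | \Phi_l \rangle = \langle \Phi | O_k^\dagger \mathcal{H} O_l | \Phi \rangle = 0, \qquad (17) $$

for $l \neq k$. By means of the projection operators $O_1, O_2, \ldots, O_n$, the entire
Hilbert space is hence split into $n$ subspaces which are orthogonal and
noninteracting.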

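In studying the trial wave function $\Phi$, it is convenient to introduce the
positive weight factors

$$ \omega_k = \frac{\langle \Phi_k | \Phi_k \rangle}{\langle \Phi | \Phi \rangle} = \frac{\langle \Phi | O_k | \Phi \rangle}{\langle \Phi | \Phi \rangle}, \qquad (18) $$

satisfying

$$ 0 \leq \omega_k \leq 1, \qquad \sum_k \omega_k = 1, \qquad (19) $$

and, for $\omega_k \neq 0$, the expectation value of the energy with respect to the wave
function $\Phi_k$:

$$ \mathcal{E}_k = \frac{\langle \Phi_k | \mathcal{H} | \Phi_k \rangle}{\langle \Phi_k | \Phi_k \rangle} = \frac{\langle \Phi | \mathcal{H} O_k | \Phi \rangle}{\langle \Phi | O_k | \Phi \rangle}. \qquad (20) $$

Using Eqs. (13) and (17), one finds particularly

$$ \langle \mathcal{H} \rangle_{\rm av} = \frac{\langle \Phi | \mathcal{H} | \Phi \rangle}{\langle \Phi | \Phi \rangle} = \frac{\sum_{k,l} \langle \Phi_k | \mathcal{H} | \Phi_l \rangle}{\langle \Phi | \Phi \rangle} = \sum_k \frac{\langle \Phi_k | \mathcal{H} | \Phi_k \rangle}{\langle \Phi | \Phi \rangle} = \sum_k \omega_k \mathcal{E}_k, \qquad (21) $$

i.e., the expectation value $\langle \mathcal{H} \rangle_{\rm av}$ is an average value of all quantities
$\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n$ with positive weights. Unless all these quantities are the same,
there exists at least one quantity $\mathcal{E}_k$ which is lower than $\langle \mathcal{H} \rangle_{\rm av}$, and the
component analysis is hence a valuable tool for lowering the energy. It may be
applied, for instance, in studying the properties of the covalent bond (21)
where it gives the energy stabilization connected with the exchange operator
$\Lambda = P_{12}$. The component analysis has further turned out to be of particular
value in studying the properties of angular momenta (22) and their
eigenfunctions.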
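The weighted-mean relation of Eq. (21) can be verified on a small matrix model. The sketch below rests on assumptions that are not in the original: a random real symmetric matrix, symmetrized under the exchange operator, plays the role of the Hamiltonian, and a random vector plays the role of the trial function. The helper name `component_analysis` is hypothetical.

```python
import numpy as np

def component_analysis(H, projectors, phi):
    """Return (mean energy, weights w_k, component energies E_k), Eqs. (18)-(21)."""
    norm = np.vdot(phi, phi).real
    mean = np.vdot(phi, H @ phi).real / norm
    weights, energies = [], []
    for O in projectors:
        comp = O @ phi                       # component Phi_k = O_k Phi
        n_k = np.vdot(comp, comp).real
        if n_k > 1e-14:                      # skip vanishing components
            weights.append(n_k / norm)
            energies.append(np.vdot(comp, H @ comp).real / n_k)
    return mean, np.array(weights), np.array(energies)

rng = np.random.default_rng(0)

# Exchange operator on two two-level particles, basis |00>, |01>, |10>, |11>.
P12 = np.array([[1.0, 0, 0, 0],
                [0, 0, 1.0, 0],
                [0, 1.0, 0, 0],
                [0, 0, 0, 1.0]])
identity = np.eye(4)
projectors = [(identity + P12) / 2, (identity - P12) / 2]

# Model Hamiltonian that commutes with P12 (illustrative construction).
A = rng.normal(size=(4, 4))
M = A + A.T
H = M + P12 @ M @ P12

phi = rng.normal(size=4)                     # trial function without symmetry
mean, weights, energies = component_analysis(H, projectors, phi)
print(mean, weights, energies)
```

Because the cross terms between the two components vanish, the mean energy is exactly the weighted sum of the component energies, and the lower of the two component energies can never exceed the mean.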


IV. Projection Operators and Component Analysis for 
Finite Groups 


Let us now consider the case when all normal constants of motion form a
finite group $G = \{g\}$ of order $|G| = n$. In order to deal with the group, it is
convenient to introduce the invariant mean over the group:

$$ \mathop{M}_s f(s) = \frac{1}{|G|} \sum_s f(s), \qquad (22) $$

which fulfills the relations

$$ \mathop{M}_s f(s) = \mathop{M}_s f(s^{-1}) = \mathop{M}_s f(gs) = \cdots \text{ etc.} \qquad (23) $$


The "group algebra" is the linear space consisting of all elements formed by
linear combinations ("addition") of the elements of the group multiplied by
complex coefficients. Let us consider two arbitrary elements $A$ and $B$ defined
by the relations

$$ A = \mathop{M}_s \alpha(s)\, s^{-1}, \qquad B = \mathop{M}_s \beta(s)\, s^{-1}. \qquad (24) $$

Their product is defined by the distributive law and the group multiplication,
i.e., $AB = \sum_{k,l} a_k b_l\, g_k g_l$, and using (23) one obtains:

$$ AB = \mathop{M}_s \mathop{M}_t \alpha(s) \beta(t)\, s^{-1} t^{-1} $$
$$ \quad = \mathop{M}_s \mathop{M}_u \alpha(s) \beta(u s^{-1})\, u^{-1} $$
$$ \quad = \mathop{M}_u \Big[ \mathop{M}_s \alpha(s) \beta(u s^{-1}) \Big] u^{-1} $$
$$ \quad = \mathop{M}_u \gamma(u)\, u^{-1}. \qquad (25) $$

One says that $A$ and $B$ correspond to the functions $\alpha$ and $\beta$ over the group,
respectively, and that the product $AB$ corresponds to a new function $\gamma$ which
is the convolution product of $\alpha$ and $\beta$ denoted by

$$ \gamma = \alpha * \beta \qquad (26a) $$

and defined by the relation

$$ \gamma(u) = \mathop{M}_s \alpha(s) \beta(u s^{-1}), \qquad (26b) $$

according to Eq. (25). We note that there is a complete isomorphism between
the group algebra and the functions over the group, and that the products in
the former correspond to convolution products in the latter, and vice versa.

Let $V$ be a subspace of order $f$ of the group algebra which is stable under all
operations of the group, so that $gV$ belongs to $V$, and let the set $X =
(X_1, X_2, \ldots, X_f)$ form a basis for the subspace. According to the definition,
one has

$$ g X_l = \sum_k X_k \Gamma_{kl}(g), \qquad (27) $$

and to each element $g$ there is hence associated a matrix $\Gamma(g) = \{\Gamma_{kl}(g)\}$, so
that $gX = X\Gamma(g)$. This gives further

$$ ghX = gX\Gamma(h) = X\Gamma(g)\Gamma(h) = X\Gamma(gh) \qquad (28) $$

and

$$ \Gamma(g)\Gamma(h) = \Gamma(gh). \qquad (29) $$

Since the matrices $\Gamma$ have the same multiplication table as the elements of the
group, they are said to form a representation of the group. Every stable
subspace $V$ defines a representation. The trace of $\Gamma$ is said to be the character of
the representation:

$$ \chi(g) = \mathrm{Tr}\{\Gamma(g)\} = \sum_k \{\Gamma(g)\}_{kk}. \qquad (30) $$

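A stable subspace $V$ is said to be irreducible, if there exists no proper
subspace of $V$ which is also stable under the group operations; the associated
representation is then called an irreducible representation. Let us now
consider two irreducible subspaces $V_\alpha$ and $V_\beta$ of order $f_\alpha$ and $f_\beta$, respectively, with
the associated representations $\Gamma_\alpha$ and $\Gamma_\beta$, and also let $A$ be a linear mapping
of $V_\beta$ on $V_\alpha$ associated with the rectangular matrix $\mathbf{A}$. The operator
corresponding to the matrix

$$ \mathbf{T} = \mathop{M}_s \Gamma_\alpha(s)\, \mathbf{A}\, \Gamma_\beta(s^{-1}) \qquad (31) $$

is a linear operator which maps $V_\beta$ into $V_\alpha$, and it satisfies the fundamental
relation

$$ \mathbf{T}\, \Gamma_\beta(g) = \Gamma_\alpha(g)\, \mathbf{T}, \qquad (32) $$

for all $g$ in the group. According to Schur's famous lemma, one has either
$\mathbf{T} = 0$ or $\mathbf{T}^{-1}$ exists; in the latter case, the representations $\Gamma_\alpha$ and $\Gamma_\beta$ are
equivalent. It is now convenient to introduce the symbol:

$$ \delta_{\alpha\beta} = \begin{cases} 0, & \text{if } \Gamma_\alpha \text{ and } \Gamma_\beta \text{ are nonequivalent,} \\ 1, & \text{if } \Gamma_\alpha \text{ and } \Gamma_\beta \text{ are identical,} \end{cases} \qquad (33) $$

which means that one essentially excludes the case when $\Gamma_\alpha$ and $\Gamma_\beta$ are
equivalent but not identical. For the case when $V_\alpha = V_\beta$, $\Gamma_\alpha = \Gamma_\beta$, a second
application of Schur's lemma gives $\mathbf{T} = \lambda \cdot \mathbf{1}$, where $\mathbf{1}$ is the identity operation
in $V_\alpha$. Hence one obtains:

$$ \mathbf{T} = \mathop{M}_s \Gamma_\alpha(s)\, \mathbf{A}\, \Gamma_\beta(s^{-1}) = \delta_{\alpha\beta} \cdot \lambda \cdot \mathbf{1}_\alpha. \qquad (34) $$

Formation of the trace gives $\lambda f_\alpha = \mathrm{Tr}\{\mathbf{A}\}$ for $\alpha = \beta$, i.e.,

$$ \mathop{M}_s \Gamma_\alpha(s)\, \mathbf{A}\, \Gamma_\beta(s^{-1}) = f_\alpha^{-1}\, \delta_{\alpha\beta}\, \mathrm{Tr}\{\mathbf{A}\} \cdot \mathbf{1}_\alpha. \qquad (35) $$

The substitution $\mathbf{A} = \mathbf{A}' \Gamma_\beta(g)$ gives further

$$ \mathop{M}_s \Gamma_\alpha(s)\, \mathbf{A}'\, \Gamma_\beta(g s^{-1}) = f_\alpha^{-1}\, \delta_{\alpha\beta}\, \mathrm{Tr}\{\mathbf{A}' \Gamma_\beta(g)\} \cdot \mathbf{1}_\alpha. \qquad (36) $$

Let us take the $(k, l)$-element of this relation; since the matrix $\mathbf{A}' = \{A'_{mn}\}$
is completely arbitrary, the coefficients of $A'_{mn}$ of both sides must be identical,
which gives

$$ \mathop{M}_s \{\Gamma_\alpha(s)\}_{km} \{\Gamma_\beta(g s^{-1})\}_{nl} = f_\alpha^{-1}\, \delta_{\alpha\beta}\, \delta_{kl}\, \{\Gamma_\beta(g)\}_{nm}, \qquad (37) $$

or in terms of the convolution notation (26):

$$ \{\Gamma_\alpha\}_{km} * \{\Gamma_\beta\}_{nl} = f_\alpha^{-1}\, \delta_{\alpha\beta}\, \delta_{kl}\, \{\Gamma_\beta\}_{nm}. \qquad (38) $$

Putting $m = k$ and summing over $k$, one obtains

$$ \chi_\alpha * \{\Gamma_\beta\}_{nl} = f_\alpha^{-1}\, \delta_{\alpha\beta}\, \{\Gamma_\beta\}_{nl}, \qquad (39) $$

and, putting $n = l$ and summing over $l$, one gets further

$$ \chi_\alpha * \chi_\beta = f_\alpha^{-1}\, \delta_{\alpha\beta}\, \chi_\beta, \qquad (40) $$

which is the fundamental convolution relation for the characters of the
irreducible representations; it contains the standard orthogonality relations for
the characters for $g = e$, where $e$ is the neutral element.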

Of essential importance in the quantum-mechanical applications are the
elements $\{P_\alpha\}_{km}$ of the group algebra associated with the functions $f_\alpha \{\Gamma_\alpha\}_{km}$
through the relation:

$$ \{P_\alpha\}_{km} = f_\alpha \mathop{M}_s \{\Gamma_\alpha(s)\}_{km}\, s^{-1}. \qquad (41) $$

Starting from Eqs. (25) and (26) and using the convolution product (38), one
obtains the product relation

$$ \{P_\alpha\}_{km} \{P_\beta\}_{nl} = \delta_{\alpha\beta}\, \delta_{kl}\, \{P_\beta\}_{nm}. \qquad (42) $$

It is evident that the quantities $\{P_\alpha\}_{kk}$ form a set of projection operators which
are idempotent and mutually exclusive. From the properties of the regular
representation, one obtains

$$ \sum_\alpha f_\alpha \chi_\alpha(g) = \begin{cases} |G| & \text{for } g = e \\ 0 & \text{for } g \neq e \end{cases} \qquad (43) $$

and, since the first relation implies that $\sum_\alpha f_\alpha^2 = |G|$, they are often referred to
as the completeness relations. By means of them, one obtains directly

$$ \sum_\alpha \sum_k \{P_\alpha\}_{kk} = e, \qquad (44) $$

i.e., the projection operators $\{P_\alpha\}_{kk}$ form a "resolution of the identity."
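The resolution of the identity can be seen at work in the smallest non-Abelian case. The sketch below is an illustration under stated assumptions, not the construction of the text: it uses the permutation group of three objects acting on ordinary three-component vectors, the standard character table of that group, and only the character sums $P_\alpha = \sum_k \{P_\alpha\}_{kk} = f_\alpha M_s\, \chi_\alpha(s)\, s^{-1}$ rather than the individual $\{P_\alpha\}_{km}$. The names `perm_matrix`, `sign`, and `character_projector` are hypothetical.

```python
import itertools
import numpy as np

def perm_matrix(p):
    """Matrix of the permutation p acting on basis vectors: e_i -> e_{p[i]}."""
    n = len(p)
    D = np.zeros((n, n))
    for i, target in enumerate(p):
        D[target, i] = 1.0
    return D

def sign(p):
    """Parity of the permutation p, obtained by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1.0 if inversions % 2 else 1.0

group = list(itertools.permutations(range(3)))

# (dimension f_alpha, character chi_alpha) for the three irreducible
# representations; the two-dimensional character is (fixed points - 1).
irreps = {
    "trivial": (1, lambda p: 1.0),
    "sign": (1, sign),
    "standard": (2, lambda p: np.trace(perm_matrix(p)) - 1.0),
}

def character_projector(f, chi):
    """f * mean over the group of chi(s) s^{-1}; s^{-1} is the transposed matrix."""
    return f * sum(chi(p) * perm_matrix(p).T for p in group) / len(group)

projectors = {name: character_projector(f, chi) for name, (f, chi) in irreps.items()}
for name, P in projectors.items():
    print(name, round(float(np.trace(P))))
```

The three operators add up to the unit matrix, each is idempotent, and their traces (1, 0, 2) count how much of each symmetry type is contained in the three-dimensional space.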

If the elements $g$ of the group $G$ represent symmetry operations which
leave the length of the vectors invariant, so that $\|g\Phi\| = \|\Phi\|$, one has

$$ \|g\Phi\|^2 = \langle g\Phi | g\Phi \rangle = \langle \Phi | g^\dagger g | \Phi \rangle = \langle \Phi | \Phi \rangle \qquad (45) $$

for all $\Phi$, i.e., $g^\dagger g = g g^\dagger = e$. This relation implies that the elements of the
symmetry group $G$ are unitary with respect to the metric used in quantum
mechanics. If the representation $\Gamma_\alpha$ is further chosen to be unitary, one
obtains

$$ \{P_\alpha\}_{km}^\dagger = \{P_\alpha\}_{mk}. \qquad (46) $$

We note particularly that the projection operators $\{P_\alpha\}_{kk}$ are self-adjoint.

By using Eq. (42), it is easily shown that the operators $\{P_\alpha\}_{km}$ are linearly
independent and, since the total number of operators is $\sum_\alpha f_\alpha^2 = |G|$, they span
the entire group algebra of order $|G|$. From the definition (41), it follows that

$$ g \{P_\alpha\}_{km} = f_\alpha \mathop{M}_s \{\Gamma_\alpha(s)\}_{km}\, g s^{-1} $$
$$ \quad = f_\alpha \mathop{M}_r \{\Gamma_\alpha(rg)\}_{km}\, r^{-1} = \sum_l \{P_\alpha\}_{kl} \{\Gamma_\alpha(g)\}_{lm}, \qquad (47) $$

i.e., the operators $\{P_\alpha\}_{km}$ of a row $(m = 1, 2, \ldots, f_\alpha)$ form a stable subspace
connected with the irreducible representation $\Gamma_\alpha$.

Let us now consider an arbitrary trial wave function and the associated 
functions g<F. It is further convenient to introduce the functions 

= {/’.),=/. M {r.(5))„,, s - ■«>, (48) 

and to arrange these functions in matrices: 


<T) a 


<h)3 • 

4 . <T) a 

^1 fa 

®21 

®*22 

®S 3 * 

.4 <T) a 

^2 fa 


<*>% 2 

3 * 

.4 (f> a 

^ fee fa J 


<j) a = 


(49) 
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It should be observed that, according to Eq. (42), functions in the same row 
may be obtained from each other by means of a shift-operator: 

<!>L={P*} nm %n- (50) 

This implies that the functions in the same row are either all vanishing or all 
nonvanishing; according to Eqs. (42) and (46), they have actually the same 
norm: 

= <<%!%>• ( 51 ) 

It follows further from Eq. (47) that the functions <h£ m in a row do transform 
according to the irreducible representation T a . 

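By combining the relations (42) and (46), one obtains the operator formulas

$$\{P_\alpha\}^\dagger_{mk}\{P_\beta\}_{nl} = \delta_{\alpha\beta}\delta_{kl}\{P_\beta\}_{nm}, \tag{52}$$

$$\{P_\alpha\}^\dagger_{mk}\,\mathcal{H}\,\{P_\beta\}_{nl} = \delta_{\alpha\beta}\delta_{kl}\,\mathcal{H}\{P_\beta\}_{nm}, \tag{53}$$

which are of fundamental importance in the quantum-mechanical applications.
As a consequence, one obtains directly the relations

$$\langle\Phi^\alpha_{mk}|\Phi^\beta_{nl}\rangle = \delta_{\alpha\beta}\delta_{kl}\langle\Phi|\Phi^\beta_{nm}\rangle, \tag{54}$$

$$\langle\Phi^\alpha_{mk}|\mathcal{H}|\Phi^\beta_{nl}\rangle = \delta_{\alpha\beta}\delta_{kl}\langle\Phi|\mathcal{H}|\Phi^\beta_{nm}\rangle, \tag{55}$$

which show that functions associated with different irreducible representations
or with different columns within the same irreducible representation are not
only orthogonal but also noninteracting with respect to $\mathcal{H}$.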

Let us now turn to the question of “component analysis.” Using Eq. (44)
one obtains

$$\Phi = 1\cdot\Phi = \Big[\sum_\alpha\sum_k \{P_\alpha\}_{kk}\Big]\Phi = \sum_{\alpha k}\Phi^\alpha_{kk}, \tag{56}$$

i.e., a resolution of $\Phi$ in terms of functions $\Phi^\alpha_{kk}$ of “diagonal” character. To
every nonvanishing $\Phi^\alpha_{kk}$, one may construct a complete row by means of the
shift operators according to Eq. (50), and one obtains a set of orthogonal
functions $\Phi^\alpha_{km}$ ($m = 1, 2, \ldots, f_\alpha$) which transforms according to the irreducible
representation $\Gamma_\alpha$. It follows from (55) that these functions have not only the
same norm but also the same energy:

$$\langle\Phi^\alpha_{km}|\mathcal{H}|\Phi^\alpha_{km}\rangle = \langle\Phi|\mathcal{H}|\Phi^\alpha_{kk}\rangle = \langle\Phi^\alpha_{kk}|\mathcal{H}|\Phi^\alpha_{kk}\rangle. \tag{57}$$

The relation (56) gives hence a component analysis of $\Phi$ completely analogous
to (13), and the expectation value of $\mathcal{H}$ with respect to $\Phi$ is thus a weighted
mean of the energies associated with the various components in accordance
with Eq. (21). Even in group theory, the component analysis is hence an
important tool to lower the energy.
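
The weighted-mean statement can be illustrated with the smallest nontrivial symmetry group. The following is a minimal sketch, assuming a two-element group generated by a reflection $\Pi$ acting on a six-dimensional model space, so that the two projection operators are $(1 \pm \Pi)/2$; the random model Hamiltonian, the dimension, and the variable names are illustrative assumptions, not quantities from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Reflection operator: reverses the order of the basis vectors, Pi @ Pi = 1.
Pi = np.fliplr(np.eye(n))

# Random real symmetric matrix, symmetrized so that it commutes with Pi.
A = rng.normal(size=(n, n))
A = (A + A.T) / 2
H = (A + Pi @ A @ Pi) / 2

# Projection operators on the even and odd subspaces (resolution of the identity).
projectors = [(np.eye(n) + Pi) / 2, (np.eye(n) - Pi) / 2]

def energy(v):
    """Expectation value of H for an unnormalized vector v."""
    return (v @ H @ v) / (v @ v)

phi = rng.normal(size=n)                      # trial function without symmetry
components = [O @ phi for O in projectors]    # component analysis of phi
weights = [(c @ c) / (phi @ phi) for c in components]
energies = [energy(c) for c in components]

total = energy(phi)
weighted_mean = sum(w * e for w, e in zip(weights, energies))
lowest_component = min(energies)
```

Because the components are orthogonal and noninteracting, `total` coincides with `weighted_mean`, and `lowest_component` cannot exceed `total`; selecting that component restores the symmetry without raising the energy.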
The situation may be further improved by considering the various columns
in (49). Let us for a moment assume that the trial wave function $\Phi$ has no
symmetry properties whatsoever and that the functions $g\Phi$ form a linearly
independent set of order $|G|$; the same applies then to the functions $\Phi^\alpha_{km}$.
The functions in different columns of (49) are orthogonal and noninteracting
and we will now construct the secular equation associated with one specific
column, say the $k$th one. According to Eq. (55), one has

$$\langle\Phi^\alpha_{mk}|\mathcal{H} - E|\Phi^\alpha_{nk}\rangle = \langle\Phi|\mathcal{H} - E|\Phi^\alpha_{nm}\rangle, \tag{58}$$

i.e., the matrix elements are independent of $k$. This implies that every column
leads to one and the same “block” in the secular equation, that every block
will be repeated $f_\alpha$ times, and that every root $E$ will hence be repeated at least
$f_\alpha$ times leading to an energy degeneracy characterized by the order of the
irreducible representation $\Gamma_\alpha$. It seems as if the characteristic polynomial in $z$
given by the secular determinant

$$\big|\langle\Phi^\alpha_{mk}|\mathcal{H} - z|\Phi^\alpha_{nk}\rangle\big| = \big|f_\alpha M_s \{\Gamma_\alpha(s)\}_{nm}\langle\Phi|\mathcal{H} - z|s^{-1}\Phi\rangle\big| \tag{59}$$

would depend explicitly on the matrix elements of $\Gamma_\alpha$, but Byers-Brown has
shown in an interesting paper in this volume that it depends only on the
characters $\chi_\alpha$. If the function $\Phi$ is in any way symmetry-adapted, the functions
$\Phi^\alpha_{mk}$ in a column may be linearly dependent, and it is then necessary to
construct an orthonormal set before setting up the secular equation which
otherwise becomes identically vanishing for all values of $z$.

The component analysis in Eq. (56) is used also in studying the properties
of the exact wave function satisfying $\mathcal{H}\Psi = E\Psi$. One has

$$\Psi = \sum_{\alpha k}\Psi^\alpha_{kk}, \tag{60}$$

and, if a specific term $\Psi^\alpha_{kk}$ is nonvanishing, one may construct a full row by
means of the shift operators according to (50):

$$\Psi^\alpha_{km} = \{P_\alpha\}_{km}\Psi^\alpha_{kk} = \{P_\alpha\}_{km}\Psi. \tag{61}$$

These functions are all nonvanishing and orthogonal, and they form a basis
for the irreducible representation $\Gamma_\alpha$. One has further

$$\mathcal{H}\Psi^\alpha_{km} = \mathcal{H}\{P_\alpha\}_{km}\Psi = \{P_\alpha\}_{km}\mathcal{H}\Psi = E\{P_\alpha\}_{km}\Psi = E\Psi^\alpha_{km}, \tag{62}$$

i.e., the functions $\Psi^\alpha_{km}$ are hence all exact eigenfunctions to the Hamiltonian
$\mathcal{H}$ associated with the eigenvalue $E$, and the order of the degeneracy is at
least $f_\alpha$. This is the fundamental theorem in the applications of group theory
to quantum mechanics.
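
The degeneracy theorem can be seen in a three-site model. The following is a minimal sketch, assuming a Hamiltonian matrix for three equivalent sites with equal couplings (so that it commutes with all permutations of the sites), the two-dimensional irreducible representation of $S_3$ in real orthogonal form, and the operator form $\{P_\alpha\}_{km} = f_\alpha M_s\{\Gamma_\alpha(s)\}_{km}s^{-1}$; the model and all names are illustrative assumptions.

```python
import itertools
import numpy as np

perms = list(itertools.permutations(range(3)))

def perm_matrix(p):
    """3x3 permutation matrix sending site i to site p[i]."""
    U = np.zeros((3, 3))
    for i in range(3):
        U[p[i], i] = 1.0
    return U

def inverse(p):
    inv = [0, 0, 0]
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Orthonormal basis of the plane orthogonal to (1, 1, 1); it carries the
# two-dimensional irreducible representation (f_alpha = 2).
B = np.array([[1.0, -1.0, 0.0], [1.0, 1.0, -2.0]]).T
B /= np.linalg.norm(B, axis=0)

def gamma(p):
    return B.T @ perm_matrix(p) @ B

def P(k, m):
    """Operator {P_alpha}_km acting in the three-site space (assumed form of Eq. (41))."""
    return (2 / 6) * sum(gamma(s)[k, m] * perm_matrix(inverse(s)) for s in perms)

# Model Hamiltonian: equal coupling -1 between every pair of sites.
H = -(np.ones((3, 3)) - np.eye(3))
commutes = all(np.allclose(H @ perm_matrix(p), perm_matrix(p) @ H) for p in perms)
spectrum = np.linalg.eigvalsh(H)

trial = np.array([0.3, -1.1, 0.8])   # arbitrary vector without symmetry
component = P(0, 0) @ trial          # "diagonal" component, k = m = 0
partner = P(0, 1) @ trial            # second member of the row via the shift operator
```

In this model `spectrum` contains one nondegenerate level and one doubly degenerate level, and both `component` and `partner` are eigenvectors belonging to the doubly degenerate level, with equal norm and mutually orthogonal.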
The group algebra ideas in this section go back to Wedderburn, Schur,
Frobenius, and Young. For a survey of the applications to quantum theory,
the reader is referred to the classical books by Weyl, Wigner, and van der
Waerden (23).

We have here discussed only finite groups, but many of the ideas based on
the concept of the “invariant mean” defined by Eq. (22) may be extended to
compact infinite groups. It should perhaps also be observed that the
projection operators for the rotation group and the translation group may be
obtained from the simple product form (10); see particularly reference (22).

V. The Projected Hartree-Fock Scheme 

The projection operators $O$ are valuable tools in discussing the constants of
motion for both single operators $A$ and groups $G = \{g\}$. Using the
component analysis based on the resolution of the identity for these quantities, one
can prove that every exact eigenfunction $\Psi$ either automatically fulfills the
relations

$$O\Psi = \Psi, \tag{63}$$

or may be written as a unique sum of components satisfying these relations.
We note that for a single constant of motion $A$, the projection operator $O$ is
given by the product form (10), whereas, for a group $G = \{g\}$, it is given by the
diagonal operator $\{P_\alpha\}_{kk}$ associated with an irreducible representation $\Gamma_\alpha$
defined by Eq. (41).

In the Hartree-Fock scheme based on a single normalized Slater determinant,
$D$, the Schrödinger equation is replaced by the variation principle, and
the conventional Hartree-Fock method is actually based on the two relations

$$\delta\langle D|\mathcal{H}|D\rangle = 0, \qquad OD = D. \tag{64}$$

Even if these relations are analogous to Eq. (63), the second equation is
certainly a constraint which is going to raise the energy. In order to improve
the wave function, it seems hence desirable to remove the constraint and to
look for the absolute minimum of $\langle D|\mathcal{H}|D\rangle$, which leads to the nonrestricted
Hartree-Fock scheme discussed in a previous section. The Slater determinant
$D$ associated with the absolute minimum is usually not an eigenfunction to
the constants of motion, but the results obtained in the previous sections
show that it may be written in the form

$$D = \sum_k D_k, \qquad D_k = O_k D, \qquad \text{case of } A, \tag{65}$$

$$D = \sum_{\alpha k} D^\alpha_{kk}, \qquad D^\alpha_{kk} = \{P_\alpha\}_{kk} D, \qquad \text{case of } G = \{g\}, \tag{66}$$

and that this component analysis is unique, that the components are
orthogonal and noninteracting, and that at least one of the components has a
lower energy than $D$ itself. This implies that, by selecting a proper component
$OD$, one may both restore the symmetry properties and lower the energy.
A further lowering of the energy may even be possible by a more detailed
study of the specific component $OD$, if one minimizes the energy with respect
to this wave function rather than with respect to $D$. This line of thinking leads
to the projected Hartree-Fock scheme suggested in 1954 by the author (24).

In using the variation principle to solve the Schrödinger equation, the
Eckart criterion (25) tells us that the accuracy of the trial wave function is
improved as the energy is lowered, and this implies that one cannot obtain
particularly good results in the conventional Hartree-Fock scheme, even if
they may be valuable and important from qualitative points of view. In the
nonrestricted Hartree-Fock scheme, one is lowering the energy in a more
effective way, but one has instead lost track of the symmetry properties and
the normal constants of motion. One natural way out of this dilemma is to
approximate the wave functions $\Psi$ by a proper projection of a single Slater
determinant:

$$\Psi \approx OD, \tag{67}$$

where $O = O_k(A)$ for a single constant of motion $A$, and $O = \{P_\alpha\}_{kk}$ for a
group $G = \{g\}$, and to apply the variation principle to the expression

$$\langle\mathcal{H}\rangle = \frac{\langle OD|\mathcal{H}|OD\rangle}{\langle OD|OD\rangle} = \frac{\langle D|O^\dagger\mathcal{H}O|D\rangle}{\langle D|O^\dagger O|D\rangle} = \frac{\langle D|\mathcal{H}O|D\rangle}{\langle D|O|D\rangle}. \tag{68}$$

The wave function $OD$ is in a sense actually the same as in the conventional
Hartree-Fock scheme, only that we have removed the constraint $OD = D$ in
Eq. (64). It is evident that the wave function $OD$ associated with the absolute
minimum of (68) is no longer going to be a single Slater determinant.
However, even if we have consequently departed from the main idea of the
Hartree-Fock scheme, the wave function $OD$ is still associated with the
independent-particle model in the sense that it is uniquely defined by a
Hartree-product through the component analysis.

In this connection, it is helpful to reconsider how the Hartree-Fock scheme
was introduced by Slater (34). The basis for the independent-particle model
of an $N$-particle system was originally the Hartree-product:

$$\Psi_1(x_1)\Psi_2(x_2)\Psi_3(x_3)\cdots\Psi_N(x_N), \tag{69}$$

where $x_k = (\mathbf{r}_k, \zeta_k)$ is the combined space-spin coordinate. If the particles are
fermions, they have to satisfy the Pauli principle, and the associated
antisymmetry property may be introduced by using the antisymmetric component
of Eq. (69) selected by the projection operator

$$O_{AS} = (N!)^{-1}\sum_P (-1)^p P, \tag{70}$$
of type (41) for the antisymmetric representation of the symmetry group $S_N$.
This projection changes the Hartree-product (69) into a Slater determinant
$D$, and it should be observed that this projection usually increases the
energy depending on the fact that fermions (in contrast to bosons) cannot be
packed into the bottom of the energy-level scheme more tightly than one
fermion per spin-orbital. A little mixture of “bosonic” character into the
wave function may hence lower the energy considerably but, for fermions,
such a procedure is certainly highly improper, and the projection operator,
Eq. (70), is essential in formulating the theory.

The basic idea of the projected Hartree-Fock scheme is to extend the
projection to cover not only the antisymmetry property but all symmetry
properties of the wave function associated with the normal constants of motion,
and the wave function $OD$ may hence be considered as the proper symmetry
projection of a single Hartree product (69) associated with the
independent-particle model. It should be observed that this projection is unique, that the
other members $D^\alpha_{km}$ may be obtained from the component $D^\alpha_{kk}$ by means of the
shift operators:

$$D^\alpha_{km} = \{P_\alpha\}_{km} D^\alpha_{kk} = \{P_\alpha\}_{km} D, \tag{71}$$

and that all these functions have the same energy.

One common objection against the projected Hartree-Fock scheme is that
the wave function $OD$ is no longer a single Slater determinant but rather a sum
of such determinants, and that the scheme is hence closer to the method of
superposition of configurations than to the independent-particle model. This is
true in the same sense as a Slater determinant is the sum of $N!$ Hartree
products, but it should be observed that the wave functions $D$ and $OD$ are
both uniquely defined by a Hartree product (69), i.e., by a set of $N$
spin-orbitals $\Psi_1, \Psi_2, \ldots, \Psi_N$, whereas the standard method of superposition of
configurations is based on a set of spin-orbitals having an order $M$ which is
larger than the number of particles: $M > N$. The fact that the wave function
$OD$ is uniquely defined by exactly $N$ spin-orbitals is of
essential importance both for the physical interpretations as to the connection
with the independent-particle model and for the simplicity of the
mathematical calculations in minimizing the energy (68). We note that the wave function
$OD$ should be considered as a conceptual entity in the same way as $D$ previously
was, and that expansions in terms of Hartree products or Slater determinants
are not necessarily the best clue for understanding the fundamental properties
of $OD$.

As an example of this principle, we will consider an arbitrary nonsingular
transformation of the basic spin-orbitals:

$$\Psi'_k = \sum_l \Psi_l a_{lk}. \tag{72}$$

As pointed out by Slater, one has the theorem

$$\det\{\Psi'_k(x_i)\} = \det\{\Psi_l(x_i)\}\cdot\det\{a_{lk}\},$$

i.e., $D' = D\cdot\det\{a_{lk}\}$, which says that the determinantal wave function is
going to be changed only by a constant factor. This theorem is of fundamental
importance in the Hartree-Fock scheme, since the basic spin-orbitals without
loss of generality may be chosen orthonormal, so that $\langle\Psi_k|\Psi_l\rangle = \delta_{kl}$. Since
the projection operators $O$ defined in the previous sections are linear
operators, one obtains directly

$$OD' = (OD)\det\{a_{lk}\}, \tag{73}$$

i.e., even in the projected Hartree-Fock scheme the wave function $OD$ is
changed only by a constant factor under the transformation (72). Even in
this scheme, the basic spin-orbitals may hence be chosen orthonormal without
loss of generality.

It is evident from Eq. (73) that the wave function $OD$ does not depend on
the choice of individual spin-orbitals but on the linear space $M_N$ spanned
by these spin-orbitals which is characterized by the projection operator (26):

$$\rho = |\boldsymbol{\Psi}\rangle\langle\boldsymbol{\Psi}|\boldsymbol{\Psi}\rangle^{-1}\langle\boldsymbol{\Psi}|, \tag{74}$$

where $\boldsymbol{\Psi} = \{\Psi_1, \Psi_2, \ldots, \Psi_N\}$. One has the fundamental relations

$$\rho^2 = \rho, \qquad \rho^\dagger = \rho, \qquad \mathrm{Tr}(\rho) = N. \tag{75}$$

If the set $\boldsymbol{\Psi}$ is chosen orthonormal, one has $\langle\boldsymbol{\Psi}|\boldsymbol{\Psi}\rangle = 1$ and further

$$\rho = |\boldsymbol{\Psi}\rangle\langle\boldsymbol{\Psi}| = \sum_{k=1}^{N} |\Psi_k\rangle\langle\Psi_k|. \tag{76}$$

If one looks at the $(x_1, x_2)$-component of this operator, one obtains

$$\rho(x_1, x_2) = \sum_{k=1}^{N} \Psi_k(x_1)\Psi_k^*(x_2), \tag{77}$$

which shows that the kernel of the projection operator $\rho$ is identical with the
Fock-Dirac density matrix (26). Since the wave function $OD$ depends
uniquely on $\rho$, the main problem in the projected Hartree-Fock scheme is to
vary $\rho$ so that the energy (68) becomes an absolute minimum.
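
The invariance properties discussed here are easy to verify numerically. The following is a minimal sketch, assuming that the spin-orbitals are represented by their values on a small discrete grid (a stand-in for the continuous coordinate $x$) and that the transformation matrix of Eq. (72) is a random nonsingular matrix; the grid size, the random data, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n_points, N = 12, 3   # number of grid points and of spin-orbitals (illustrative)

# Column k holds the values of spin-orbital Psi_k on the grid (not orthonormal).
Psi = rng.normal(size=(n_points, N)) + 1j * rng.normal(size=(n_points, N))

# Nonsingular transformation of Eq. (72): Psi'_k = sum_l Psi_l a_lk.
a = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
Psi_prime = Psi @ a

def density(orbitals):
    """Projection operator of Eq. (74): |Psi> <Psi|Psi>^-1 <Psi|."""
    metric = orbitals.conj().T @ orbitals
    return orbitals @ np.linalg.inv(metric) @ orbitals.conj().T

rho = density(Psi)
rho_prime = density(Psi_prime)

# Determinant theorem: evaluate the N x N determinant at N chosen grid points.
rows = [0, 4, 9]
det_original = np.linalg.det(Psi[rows, :])
det_transformed = np.linalg.det(Psi_prime[rows, :])
det_a = np.linalg.det(a)
```

Here `rho` and `rho_prime` agree although the individual orbitals differ, `rho` satisfies the relations (75), and `det_transformed` equals `det_original * det_a`.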

In the same way as the variational problem in the nonrestricted
Hartree-Fock scheme leads to a reduced eigenvalue problem in the one-electron space:

$$\mathcal{H}_{\mathrm{eff}}(1)\Psi_k(x_1) = \epsilon_k\Psi_k(x_1), \tag{78}$$

involving an effective Hamiltonian, $\mathcal{H}_{\mathrm{eff}}(1)$, which depends only on $\rho$, the
same is true also in the projected Hartree-Fock scheme. Writing $\langle\mathcal{H}\rangle$ for
the expectation value (68), one obtains by varying the energy expression (68):

$$\delta\langle\mathcal{H}\rangle = \langle D|O|D\rangle^{-1}\{\langle\delta D|(\mathcal{H} - \langle\mathcal{H}\rangle)|OD\rangle + \text{complex conjugate}\} = 0, \tag{79}$$

which gives

$$\langle\delta D|(\mathcal{H} - \langle\mathcal{H}\rangle)O|D\rangle = 0. \tag{80}$$

This implies that the resulting equations will be exactly the same as those
obtained in the ordinary nonrestricted Hartree-Fock scheme if one replaces
the original Hamiltonian $\mathcal{H}$ by the composite Hamiltonian $(\mathcal{H} - \langle\mathcal{H}\rangle)O$,
which contains both the projection operator $O$ and the final expectation value
$\langle\mathcal{H}\rangle$ of the energy. The composite Hamiltonian is usually a many-body
Hamiltonian but, in a previous paper (27), the author has shown that the ordinary
Hartree-Fock theory may be extended to include also such Hamiltonians and
that the resulting effective Hamiltonian depends only on the quantity $\rho$ defined
by Eq. (77).

These extended Hartree-Fock equations were suggested in 1954, but they
have so far not been solved “exactly” for any particular system. In this
connection, it may be worthwhile to remember that the ordinary
Hartree-Fock equations for the absolute minimum of the nonrestricted scheme,
suggested much earlier, have not been solved “exactly” for even the simplest
atomic systems and that much work remains to be done in this field. Instead
one has to be satisfied with numerical approximations which have been
successively refined as the capacity and speed of the modern electronic
computers have been increased. One of the most successful approximations is
based on the ASP-MO-LCAO-SCF idea (28), in which the basic spin-orbitals
$\Psi_k$ are expanded in terms of a truncated set $\{\phi_\mu\}$ of order $M$:

$$\Psi_k = \sum_{\mu=1}^{M} \phi_\mu c_{\mu k}. \tag{81}$$

Introducing the charge- and bond-order matrix $\mathbf{R} = \mathbf{c}\mathbf{c}^\dagger$, where

$$R_{\mu\nu} = \sum_{k=1}^{N} c_{\mu k} c^*_{\nu k}, \tag{82}$$

one obtains $\rho = |\boldsymbol{\Psi}\rangle\langle\boldsymbol{\Psi}| = |\boldsymbol{\phi}\rangle\mathbf{c}\mathbf{c}^\dagger\langle\boldsymbol{\phi}| = |\boldsymbol{\phi}\rangle\mathbf{R}\langle\boldsymbol{\phi}|$ and

$$\rho(x_1, x_2) = \sum_{\mu,\nu=1}^{M} \phi_\mu(x_1) R_{\mu\nu} \phi^*_\nu(x_2), \tag{83}$$

which shows that the fundamental invariant $\rho$ may be approximately described
by the discrete matrix $\mathbf{R}$ of order $M \times M$. The main problem is now to vary $\mathbf{R}$
so that the energy (68) becomes an absolute minimum, and, in practice, this
procedure may be carried out as described in a previous paper (28) dealing
with the Hartree-Fock method for many-body Hamiltonians.
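
The discrete form of the fundamental invariant can be illustrated as follows. This is a minimal sketch, assuming a basis $\{\phi_\mu\}$ that is not necessarily orthonormal and is characterized by a Hermitian positive definite overlap matrix $\mathbf{S}$ with elements $\langle\phi_\mu|\phi_\nu\rangle$; the overlap matrix, the random coefficients, the symmetric orthonormalization step, and all names are illustrative assumptions that go beyond what is stated in the text. With orthonormal spin-orbitals, the relations (75) then take the matrix form $\mathbf{RSR} = \mathbf{R}$, $\mathbf{R}^\dagger = \mathbf{R}$, $\mathrm{Tr}(\mathbf{RS}) = N$.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 6, 2   # order of the truncated basis and number of spin-orbitals (illustrative)

# Hypothetical overlap matrix S of the basis: Hermitian and positive definite.
X = rng.normal(size=(M, M)) + 1j * rng.normal(size=(M, M))
S = X.conj().T @ X + np.eye(M)

# Arbitrary expansion coefficients c0[mu, k] of Eq. (81), not yet orthonormal.
c0 = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))

def inverse_sqrt(A):
    """Inverse square root of a Hermitian positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(w ** -0.5) @ V.conj().T

# Symmetric orthonormalization so that <Psi_k|Psi_l> = (c^dagger S c)_kl = delta_kl.
c = c0 @ inverse_sqrt(c0.conj().T @ S @ c0)

# Charge- and bond-order matrix of Eq. (82).
R = c @ c.conj().T

idempotency_defect = np.linalg.norm(R @ S @ R - R)
particle_number = np.trace(R @ S).real
```

In this sketch `idempotency_defect` vanishes to rounding accuracy and `particle_number` equals $N$; for an orthonormal basis, $\mathbf{S}$ is the unit matrix and the conditions reduce to $\mathbf{R}^2 = \mathbf{R}$ and $\mathrm{Tr}(\mathbf{R}) = N$.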

The projected Hartree-Fock scheme is conceptually simple, but the
mathematics involved looks rather complicated because we are not familiar with it.
In the discussion, we will indicate that, for electronic systems, the method
seems to give good results (with a relative accuracy of better than $5 \times 10^{-4}$
in the total energy), but we do not want to give the impression that such
numerical results can be obtained without a considerable amount of work
and computation. The best way to gain a deeper understanding of the features
of the projected Hartree-Fock scheme probably is to calculate the first- and
second-order reduced density matrices associated with the wave function
$OD$ which depend only on the quantity $\rho$. The first-order density matrix is
actually the clue for obtaining the extended Hartree-Fock equations for the
projected scheme (78) in explicit form. The first-order density matrix for
spin-projected determinants $OD$ has been found by Harriman (29), and
further work on this problem is in progress.

In conclusion, it should be observed that we are here dealing with
spin-orbitals, i.e., one-electron functions of the form

$$\Psi(x) = \Psi(\mathbf{r}, \zeta) = \Psi_+(\mathbf{r})\alpha(\zeta) + \Psi_-(\mathbf{r})\beta(\zeta). \tag{84}$$

It is not yet known whether such general spin-orbitals are of importance in
treating systems where the spin does not enter the Hamiltonian explicitly, but
they are of fundamental importance as soon as spin-orbit couplings or
spin-spin couplings are introduced in the Hamiltonian. In such a scheme, the
fundamental invariant $\rho$ has four space components $\rho_{++}$, $\rho_{+-}$, $\rho_{-+}$, $\rho_{--}$,
rather than two.


VI. The Correlation Problem 

A study of the independent-particle model would be incomplete without 
a discussion of the correlation problem. Since in the independent-particle 
model, each particle moves in the outer field and in the “average-field” of all 
the other particles, one neglects the correlation between their motions 
depending on the fact that they may strongly repel each other due to Coulomb 
repulsion and hard-core interaction. In this section, we will essentially discuss 
the correlation problem in electronic systems, but the arguments may be 
extended to general systems of fermions. 

For electrons, the correlation problem is associated with the Coulomb
repulsion $e^2/r_{12}$ which creates a “Coulomb hole” around each electron with
respect to all other electrons.

For electrons with parallel spins, the exchange interactions will create a
“Fermi hole” which will take care also of the main part of the Coulomb hole,
so the essential part of the correlation problem deals with pairs of electrons
having antiparallel spins. The problem is complicated by the fact that,
depending on the classical formulation of the Pauli principle, one assigns, in the
conventional Hartree-Fock scheme, two electrons with antiparallel spins, $\alpha$ and
$\beta$, to each orbital available. This pairing of the electrons is essential also for
constructing determinants $D$ of correct symmetry, but, since the two electrons
are forced to stay in one and the same orbital, the arrangement gives rise to
a considerable correlation error.

In order to get an idea of the order of magnitude of this error, the
correlation energy is introduced (30) as the difference between the exact energy of the
Hamiltonian and the energy of the conventional Hartree-Fock scheme:

$$E_{\mathrm{corr}} = E_{\mathrm{exact}} - E_{\mathrm{HF}}. \tag{85}$$

A simple study of the two-electron systems (30), the He-like ions and the
hydrogen molecule, indicates that the correlation errors amount to about
$-1.1$ eV of which $+1.1$ eV refers to the kinetic energy and $-2.2$ eV to the
potential energy according to the virial theorem. The correlation energy per
electron pair is not a constant, and it goes up considerably with increasing
atomic number (31). As a rule of thumb, the correlation energy is
approximately 1% of the total energy.

One may ask how much the correlation error would go down, if one released
the “pairing constraint” and let the electrons with different spins go into
different orbitals, to avoid each other, “if they so desire.” The first example
of this type was treated by Hylleraas (11) and involved the splitting of the
$(1s)^2$-shell for the helium atom into the form $(1s', 1s'')$ with a remarkable
energy lowering as a result. A second example was given by Slater (75) in his
study of the monovalent metals in which he showed that the energy curves for
separated atoms could be lowered enough to obtain their correct asymptotic
form by permitting electrons with different spin to be on different sublattices.
In the nonrestricted Hartree-Fock scheme, one can hence remove a large part
of the correlation error simply by permitting “different orbitals for different
spins” (DODS).

The problem of treating the symmetry properties in this connection is 
solved in the projected Hartree-Fock scheme, which permits also a further 
lowering of the total energy. The simple applications carried out to atomic 
and molecular systems, so far, indicate that about 95% of the correlation 
energy can be removed in this way, and the remaining error in the total 
energy should then be about 0.05%, which is perhaps sufficient to explain the 
main qualitative features of the systems under consideration. 

A special case of the method of different orbitals for different spins in the 
projected Hartree-Fock scheme is the alternant-molecular-orbital (AMO) 
method suggested by the author (32) in 1953. In this method, electrons 
having different spins in alternant systems may accumulate on different
subsystems, and their separation may be regulated by a single variable 
parameter S which is determined by the variation principle applied to Eq. (68), 
or by using one parameter S k per electron pair. The method has now been 
extensively applied to conjugated systems (33), and the correlation problem 
for the infinite linear chain has been treated in this way by Pauncz and de 
Heer (34). It is presently being used for a study of the cohesive properties of 
the alkali metals by Calais (35). More applications of the general method of 
different orbitals for different spins in the projected Hartree-Fock method to 
atoms and small molecules are in progress or are being planned in the Uppsala 
and Florida projects. 
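
The interplay between different orbitals for different spins, projection, and the energy expression (68) can be followed in a toy calculation. The following is a minimal sketch, assuming a two-site, two-electron model with hopping `t` and on-site repulsion `U` (a Hubbard-type dimer) as a stand-in for an alternant system; the model, the mixing angle `theta`, and the numerical values are illustrative assumptions and are not taken from the text. The spin-up electron occupies $a = \cos\theta\,\phi_1 + \sin\theta\,\phi_2$ and the spin-down electron $b = \sin\theta\,\phi_1 + \cos\theta\,\phi_2$; for a spin-free Hamiltonian the singlet projection of the determinant amounts to symmetrizing the spatial product $a(1)b(2)$.

```python
import numpy as np

t, U = 1.0, 4.0   # illustrative model parameters

# One-electron Hamiltonian on two sites and the two-electron Hamiltonian in the
# spatial product basis phi_i(1) phi_j(2), stored at index 2*i + j.
h = np.array([[0.0, -t], [-t, 0.0]])
I2 = np.eye(2)
H = np.kron(h, I2) + np.kron(I2, h)
H[0, 0] += U   # both electrons on site 1
H[3, 3] += U   # both electrons on site 2

# Exchange of the two space coordinates and the spatial form of the singlet projector.
P12 = np.zeros((4, 4))
for i in range(2):
    for j in range(2):
        P12[2 * j + i, 2 * i + j] = 1.0
O = (np.eye(4) + P12) / 2

def energies(theta):
    """Energy of the unprojected product a(1)b(2) and of its singlet projection, Eq. (68)."""
    a = np.array([np.cos(theta), np.sin(theta)])   # orbital for spin up
    b = np.array([np.sin(theta), np.cos(theta)])   # orbital for spin down
    D = np.kron(a, b)
    unprojected = (D @ H @ D) / (D @ D)
    projected = (D @ H @ O @ D) / (D @ O @ D)
    return unprojected, projected

thetas = np.linspace(0.0, np.pi / 4, 1001)
values = np.array([energies(th) for th in thetas])

e_paired = energies(np.pi / 4)[0]          # a = b: conventional doubly filled orbital
e_dods = values[:, 0].min()                # different orbitals for different spins
e_projected = values[:, 1].min()           # variation after projection, Eq. (68)
e_exact = U / 2 - np.sqrt((U / 2) ** 2 + 4 * t ** 2)   # exact singlet ground state
```

For this particular model the single variational parameter already reproduces the exact ground-state energy after projection, whereas the unprojected energies `e_paired` and `e_dods` lie above it; this is a property of the toy model and should not be read as a statement about real atoms or molecules.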

The success of the independent-particle model in treating atoms, molecules, 
solids, and particularly nuclei has often been hard to explain, and it was 
suggested by Brueckner (36) that it may depend essentially on another form of 
the self-consistent-field scheme based on the effective two-body part of the so- 
called reaction operator. The author (37) has shown that this approach may 
be refined to an exact self-consistent-field scheme by considering the full 
reaction operator. It is my opinion that, at least for electrons, one may remove 
the essential part of the correlation effects simply by a proper treatment of the 
symmetry properties, and this leads to a close connection between the shell 
structure and the constants of motion underlying the independent-particle 
model. For electrons, only the last 0.05% of the total energy would require
studies of the reaction operator through infinite-order perturbation theory 
(33), whereas, for atomic nuclei, the situation may be considerably more 
complicated. 